Introduction: Voicing in Dutch


Jeroen van de Weijer & Erik Jan van der Torre 
Leiden University Centre for Linguistics 


This volume focuses on the phonology, phonetics and psycholinguistics of voic- 
ing-related phenomena in Dutch. Dutch phonology has played a touchstone role in 
the past few decades where competing theories regarding laryngeal representation 
have been concerned. The intricacy of the different rules manipulating values for the
distinctive feature [voice], sometimes from [+voice] to [-voice] and back again,
has sparked off different debates, among other things with respect to rule ordering
and the ‘arity’ of the feature [voice], which are currently still in full swing.
Outside such discussions about segmental structure proper, processes like final 
devoicing have played a role in discussions about “evolutionary phonology” 
(Blevins 2004), where this process is related to differences between stops and 
fricatives, vowel length and differences in place of articulation (Blevins 2004:
103ff). All of these factors play a role in some of the articles in this volume. 

This volume adds fuel to these debates on several fronts, both on the level of 
the facts that competing analyses must account for and by critically examining 
different analyses that have been proposed. First, the article by Zonneveld reviews 
the facts of the standard language and presents an overview of formal approaches, 
from rule-based generative phonology-style ones to various recent OT-based 
analyses using local conjunction. It lays out the paradoxical behaviour of the
past tense morpheme in Dutch, and the problems this
poses for these different approaches. It also presents interesting new material from 
loanword data and the way these are incorporated, with special attention to voice. 
Finally, it presents a new OT analysis relying on local conjunction and positional 
faithfulness which overcomes the problems of past analyses. Importantly, this 
analysis is able to maintain a monovalent feature [voice]. 

An area of controversy in the literature is which feature should be used to express
voicing contrasts in different languages. For ‘aspiration’ languages such as
English and German, the feature [spread glottis] seems adequate while 
(pre)voicing languages such as Dutch would seem to require the distinctive fea- 
ture [voice]. For both features it is possible to argue about the question whether 
they are binary or unary and whether —if binary— they are initially underspecified 
or not. This makes predictions about acquisition, in particular with respect to the 
question which member of a pair of consonants is expected to be acquired first, 
and which error patterns are expected under any of these approaches. This is the
topic of the contribution by René Kager, Suzanne van der Feest, Paula Fikkert,
Annemarie Kerkhoff and Tania S. Zamuner, who investigate these questions for 
the three languages mentioned above, and conclude that the facts of acquisition 
indeed point to differential specifications for voicing languages and for aspiration 
languages. A number of other factors are important in this debate, viz. the role of 
phonetics (in terms of articulatory effort) and the role of other processes that 
might interfere with the pattern of errors that children make, in particular conso- 
nant harmony. 

The third paper, by Marc van Oostendorp, investigates a hitherto unreported 
aspect of the Dutch voicing rules, viz. the fact that in certain dialects there appear 
to be exceptions to final devoicing. While devoicing has been investigated from a 
phonetic point of view and has (sometimes) been found to be incomplete in pho- 
netic detail, certain dialects appear to show systematic exceptions in the syn- 
chronic phonology. These exceptions are well-defined: they take place in the case 
of final labial and velar fricatives in the first person plural. A historical explana- 
tion is that these dialects have recently lost (or still variably have) a first person 
morpheme which ‘protects’ the final consonant from undergoing devoicing. Syn- 
chronically, there are two alternative ways of approaching this: one based on 
paradigmatic uniformity and one based on abstract underlying representations, 
both of which present certain problems. It is hoped that facts like these, possibly 
complemented by other dialectal variations on the theme of voicing, and their 
analysis, will play a role in future discussions about the facts of Dutch. 

Petra M. van Alphen describes the exact phonetic realization of the voiced 
stops in Dutch, offering an introduction to the phonetic side of the voicing distinc- 
tion in Dutch. She shows that vocal cord vibration, which is usually assumed to 
accompany voiced plosives, is frequently absent in these sounds. Surprisingly, it 
is still possible for Dutch listeners to recognize voiced plosives compared to 
voiceless plosives. This means that other acoustic cues must be available that aid 
the perception of voiced plosives, and it entails that voicing is indeed, phoneti- 
cally, a gradient category. 

Wouter Jansen explores the thin (or non-existent) line between phonetics and 
phonology, in an exploration of the facts of regressive voicing assimilation. He 
shows that regressive assimilation indeed does take place, but that it has all the 
hallmarks of a ‘low-level’ phonetic process, more akin to a coarticulatory effect 
than a ‘real’ phonological rule. The question therefore arises in which component 
of the grammar it should be accounted for. 

In the final paper of this volume, Mirjam Ernestus and Harald Baayen take up 
the fact, referred to above, that final devoicing in Dutch presents a case of pho- 
netically incomplete neutralization (cf. also Port & Leary 2005, where this point is 
taken as a frontal attack on the main premises of generative phonology). On the 
basis of a perception experiment, they show that listeners rated different plosives 
differently according to whether they alternated between voiced and voiceless or 
not. They take this as evidence that listeners activate morphologically related 
words when accessing a particular form of a paradigm. If these forms have consonants
with different values for [voice] (i.e. if they alternate), the resulting sound
will be a ‘compromise’ between voiced and voiceless. 

We hope that these papers will serve to describe the state of the art in the phonology
and phonetics of Dutch voicing, and to spark off new descriptive, theoretical
and experimental research.

Issues in Dutch Devoicing
Positional Faithfulness, Positional Markedness, 
and Local Conjunction 


Wim Zonneveld 
Utrecht Institute of Linguistics OTS 


The voice phenomena of Dutch are among the most complex and elusive described 
in the (theoretical) literature of the past few decades (Lombardi 1999, Wetzels & 
Mascaró 2001). This paper starts by giving a comprehensive overview of the pertinent
facts, and then shows how theories such as the rule-based framework and non-linear
Principles and Parameters theory have struggled to come to grips with them.
The point of the paper’s second half is a demonstration of how in Optimality The- 
ory, Lombardi’s theory of voice is well equipped to cover Dutch, specifically its 
awkward language-specific phenomena of fricative devoicing and past tense pro- 
gressive assimilation, assuming the mechanism of local conjunction (and, in fact, 
self-conjunction), involving the in itself uncontroversial *LAR (‘no voice (on an ob- 
struent)’) constraint. Pace earlier accounts, the proposed analysis preserves the pri- 
vativity of the feature [voice], and — within OT — the ‘positional faithfulness’ spirit 
of Lombardi’s approach to voice. 


1. Introduction 

Once upon a time (not so long ago), Final Devoicing was a simple phonologi- 
cal process. Presentations involved straightforward alternations such as (Dutch) 
han[t]/hand-en ‘hand(s)’ (or a German, Polish, Russian, or Catalan equivalent),
accompanied by the distributional observation that anyway voiced obstruents 
failed to occur at the end of words, by a phonological rule in the standard format, 
by the argument that the rule could not be formulated ‘the other way around’ be- 
cause of non-alternating forms such as kant/kant-en ‘side(s)’, and by examples of 
linear ordering involving rules such as regressive assimilation. Currently the same 
process is a heatedly debated phenomenon, inviting all manner of theoretical and 
empirical questions. This paper discusses two topics of primary interest from 
Standard Dutch, a language figuring prominently in the current debate. These are, 
first, the interaction between a variety of devoicing processes, both syllable- 
finally and in clusters, and, second, the special status of the past tense. Analyses 
of these notoriously recalcitrant phenomena will be presented in Optimality The- 
ory, using well-known work by Lombardi as a point of departure. Two OT- 
theoretical issues will turn out to be relevant. The first is that of ‘positional faith- 
fulness’ (PF) and (/or) ‘positional markedness’ (PM), where the analysis proposed 
follows Lombardi (2001) and Alderete (2003) in assuming that “both PF and PM
constraints a[re] integral parts of the constraint component CON [for] different
types of evidence” (Alderete 2003:150). Second, because the UG mechanism of 
local conjunction is central to it, the proposed account will be followed by a brief 
discussion of ‘metaconstraints’ on local conjunction. (Devoicing being the em- 
pirically central notion, the account’s logic will entail conjoining *LAR (a.k.a. 
*[-son, +voice]) with appropriate other constraints.) Referring to work by Fuka- 
zawa & Lombardi (2003) and Ito & Mester (2003), it is argued that ‘M&F’, ‘mul- 
tiple’ and ‘self-’conjunction are involved in an explanation of the Dutch facts, 
contributing to a programme that aims to avoid “a large and unwieldy innate 
component [by] analyz[ing] more complex constraints as combinations of simpler 
constraints” (Fukazawa & Lombardi 2003:196). 

This paper’s exposition starts with three chronologically consecutive types of 
analysis, proposed within the rule-based framework, non-linear phonology, and so 
far within Optimality Theory (henceforth OT), in that order. The historical sketch¹
is especially suited to highlight the intricate empirical properties of the Dutch sys- 
tem, and the recurring analytical issues. Sections 2 and 3 introduce the basic gen- 
eralizations, analysed in a classical rule-based manner. Section 4 describes a 
(mainly Lombardi’s) non-linear rule-and-repair type analysis in a Principles & 
Parameters framework, and identifies the typological position of Dutch. Section 5 
gives an OT translation of this analysis gleaned from several publications by 
Lombardi; but this translation remains incomplete where some of the voicing gets 
tough. Sections 6 and 7 discuss two recent OT analyses by Grijzenhout and 
Kramer, which cover more complete ground; it will be concluded that these ac- 
counts are unsatisfactory, and fall foul of both theoretical and empirical problems. 
Sections 8 and 9 present this paper’s new analysis of the Dutch facts, extending 
brief and informal suggestions by Lombardi (1997) into an OT conjunction ac- 
count. 


2. The basic system 

Introducing the major patterns of Dutch voicing assimilation, consider the 
early generative rules in (1-4) below, and the data in (5) (all of these are com- 
pound words, which relatively straightforwardly give insight into the basic sys- 
tem; see Trommelen & Zonneveld 1979, van der Hulst 1980, Booij 1995, Heems- 
kerk & Zonneveld 2000). 


(1) FINAL DEVOICING (henceforth: FINDEV)
[-son] → [-voice] / ___ $


(2) REGRESSIVE VOICING ASSIMILATION (henceforth: RVA)
[-son] → [α voice] / ___ (#) [α voice]


(3) FRICATIVE DEVOICING (henceforth: FRICDEV)
a. [-son, +cont] → [-voice] / [-son] (#) ___

or:

b. [-son, +cont] → [-voice] / [-voice] (#) ___


(4) DEGEMINATION
Cᵢ (#) Cᵢ → Cᵢ


(5) a. ijk          ijk-en         ijk-punt       [k-p]   ‘benchmark’
       straf        straff-en      straf-kamp     [f-k]   ‘penal colony’
       hand [-t]    hand-en        hand-palm      [t-p]   ‘palm of the hand’
       proef        proev-en       proef-tijd     [f-t]   ‘probation’
       strop        stropp-en      strop-das      [b-d]   ‘necktie’
       lach         lach-en        lach-bui       [ɣ-b]   ‘laughing fit’
       bloed [-t]   bloed-en       bloed-bank     [d-b]   ‘blood bank’
       huis         huiz-en        huis-baas      [z-b]   ‘landlord’

    b. druk         drukk-en       druk-fout      [k-f]   ‘printing error’
       trompet      trompett-en    trompet-solo   [t-s]   ‘trumpet solo’
       zorg [-x]    zorg-en [-ɣ]   zorg-sector    [x-s]   ‘social services’
       rib [-p]     ribb-en        rib-fluweel    [p-f]   ‘(needle) cord’
       grijp        grijp-en       grijp-graag    [p-x]   ‘grabby’
       rond [-t]    rond-e         rond-vaart     [t-f]   ‘round trip by boat’
       drijf        drijv-en       drijf-zand     [f-s]   ‘quicksand’
       huis         huiz-en        huis-vuil      [s-f]   ‘household refuse’


The leftmost column of (5) illustrates FINDEV, and the second column the under- 
lying value of the stem-final obstruent before a vowel-initial suffix. The crucial 
criterion distinguishing (5a) from (5b) is that the (a)-cases have right-hand plosives
in the internal cluster,² whereas the (b)-cases have right-hand fricatives: the
cluster voice values depend on this difference. 

A number of ‘ordering relations’ hold between the members of the rule set. 
First, FINDEV operates internally in compounds when the second member is sono- 
rant-initial: 


(6)  goud [-t]     goud-en        goud-ader [-t]    ‘gold vein’
     hand [-t]     hand-en        hand-rem [-t]     ‘handbrake’
     grond [-t]    grond-en       grond-wet [-t]    ‘constitution’
     schrob [-p]   schrobb-en     schrob-net [-p]   ‘trawl net’
     zuig [-x]     zuig-en [-ɣ]   zuig-nap [-x]     ‘sucking disc’


This implies that, using ordered rules, FINDEV precedes RVA: 


(7)        /strop#das/   /hand#palm/   /bloed#bank/             /strop#das/
    FIND      vac.           t             t           RVA          b
    RVA        b            vac.           d           FIND        *p
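
The ordering argument in (7) can be replayed mechanically. The Python fragment below is only an illustrative sketch and not part of the original rule-based analyses: the function names, the ASCII transcription (with "G" standing in for the voiced velar fricative) and the restriction of RVA to obstruent triggers, which is suggested by the sonorant-initial cases in (6), are all assumptions made here.

```python
# Illustrative sketch of the ordering FINDEV > RVA from (7).
# Assumptions: ASCII transcription, "G" = voiced velar fricative;
# RVA is triggered by obstruents only (cf. the sonorant-initial cases in (6)).
DEVOICE = {"b": "p", "d": "t", "g": "k", "v": "f", "z": "s", "G": "x"}
VOICE = {voiceless: voiced for voiced, voiceless in DEVOICE.items()}
OBSTRUENTS = set(DEVOICE) | set(VOICE)


def findev(c1, c2):
    """FINDEV: the member-final obstruent c1 loses its voicing."""
    return DEVOICE.get(c1, c1), c2


def rva(c1, c2):
    """RVA: c1 takes over the voice value of a following obstruent c2."""
    if c1 in OBSTRUENTS and c2 in OBSTRUENTS:
        c1 = VOICE.get(c1, c1) if c2 in DEVOICE else DEVOICE.get(c1, c1)
    return c1, c2


def derive(c1, c2, rules):
    """Apply the rules in the given order to the cluster c1 # c2."""
    for rule in rules:
        c1, c2 = rule(c1, c2)
    return f"{c1}-{c2}"


print(derive("p", "d", [findev, rva]))  # strop#das
print(derive("d", "p", [findev, rva]))  # hand#palm
print(derive("d", "b", [findev, rva]))  # bloed#bank
print(derive("p", "d", [rva, findev]))  # strop#das, reversed order
```

Under these assumptions the first three calls return the attested clusters of (5a), while the reversed order in the last call returns the starred p-d outcome at the right of (7).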


Second, (3) above contains two versions of the rule of FRICDEV, both of which 
capture the data in (5b). Their ordering relations differ, however:

(8) a. FRICATIVE DEVOICING: context [-son]


           /kerk#vorst/   /rond#vaart/             /kerk#vorst/   /rond#vaart/
    FRICD       f              f          FIND         vac.            t
    FIND       vac.            t          FRICD         f              f
    RVA        vac.           vac.        RVA          vac.           vac.


b. FRICATIVE DEVOICING: context [—voice] (redundantly [-son]) 


           /kerk#vorst/   /rond#vaart/             /kerk#vorst/   /rond#vaart/
    RVA         g             vac.        FIND         vac.            t
    FIND        k              t          FRICD         f              f
    FRICD       f              f          RVA          vac.           vac.


When FRICDEV’s context mentions [-son] as in (8a), the rule is a statement about
the distribution of fricatives, rather than one of assimilation (which is (8b)). But
given FINDEV, voiceless clusters are always ensured, and the only ordering requirement
is that the (a)-version of FRICDEV precede RVA. The assimilation version
of FRICDEV must follow FINDEV. RVA can take two positions, as indicated
in (8b). The shape-independent ordering of FRICDEV is the one at the right in both
(8a,b).⁴
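
The ordering requirements just stated can be checked with a small simulation. This is a hedged sketch rather than a rendering of any published formalisation: the names fricdev_a and fricdev_b, the ASCII transcription ("G" for the voiced velar fricative) and the reduction of each compound to the two obstruents flanking the boundary are assumptions introduced for illustration.

```python
# Sketch of the two FRICDEV versions in (3) and the orderings compared in (8).
# Assumptions: ASCII transcription ("G" = voiced velar fricative); each rule
# rewrites the pair (c1, c2) of obstruents flanking the compound boundary.
DEVOICE = {"b": "p", "d": "t", "g": "k", "v": "f", "z": "s", "G": "x"}
VOICE = {voiceless: voiced for voiced, voiceless in DEVOICE.items()}
OBSTRUENTS = set(DEVOICE) | set(VOICE)
FRICATIVES = set("fvszxG")


def findev(c1, c2):
    """FINDEV: devoice the obstruent before the boundary."""
    return DEVOICE.get(c1, c1), c2


def rva(c1, c2):
    """RVA: c1 agrees in voicing with a following obstruent."""
    if c1 in OBSTRUENTS and c2 in OBSTRUENTS:
        c1 = VOICE.get(c1, c1) if c2 in DEVOICE else DEVOICE.get(c1, c1)
    return c1, c2


def fricdev_a(c1, c2):
    """Version (3a): a fricative is devoiced after any obstruent."""
    if c1 in OBSTRUENTS and c2 in FRICATIVES:
        c2 = DEVOICE.get(c2, c2)
    return c1, c2


def fricdev_b(c1, c2):
    """Version (3b): a fricative is devoiced after a voiceless obstruent."""
    if c1 in VOICE and c2 in FRICATIVES:
        c2 = DEVOICE.get(c2, c2)
    return c1, c2


def derive(c1, c2, rules):
    for rule in rules:
        c1, c2 = rule(c1, c2)
    return f"{c1}-{c2}"


ROND_VAART = ("d", "v")
print(derive(*ROND_VAART, [fricdev_a, findev, rva]))  # (8a), left order
print(derive(*ROND_VAART, [findev, fricdev_a, rva]))  # (8a), right order
print(derive(*ROND_VAART, [findev, rva, fricdev_a]))  # (3a) ordered after RVA
print(derive(*ROND_VAART, [fricdev_b, findev, rva]))  # (3b) ordered before FINDEV
```

On these assumptions the first two orders give the attested t-f, whereas placing (3a) after RVA or (3b) before FINDEV leaves a voiced segment in the cluster, in line with the ordering statements in the text.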

Finally, the rule set includes DEGEMINATION, which reduces sequences of 
identical consonants: 


(9)  sprint-en    sprint-titel     ‘sprint title’        t-t → [t]
     kans-en      kans-spel        ‘game of chance’      s-s → [s]
     spoed-ig     spoed-debat      ‘emergency debate’    d-d → t-d → d-d → [d]
     zwijg-en     zwijg-geld       ‘hush money’          ɣ-ɣ → x-ɣ → x-x → [x]
     sport-en     sport-dag        ‘sports day’          t-d → d-d → [d]
     proev-en     proef-fabriek    ‘pilot plant’         v-f → f-f → [f]
     dans-en      dans-zaal        ‘dance hall’          s-z → s-s → [s]
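
The derivational chains in (9) follow from running the rule set in the order FINDEV, FRICDEV, RVA and then applying DEGEMINATION. The sketch below records each intermediate cluster; it is an illustration under stated assumptions (ASCII transcription with "G" for the voiced velar fricative, version (3a) of FRICDEV, and a hypothetical helper named stages), not a claim about how the original derivations were computed.

```python
# Sketch: intermediate stages of the derivations in (9).
# Assumptions: "G" = voiced velar fricative; FRICDEV in its (3a) form;
# rule order FINDEV > FRICDEV > RVA, followed by DEGEMINATION (4).
DEVOICE = {"b": "p", "d": "t", "g": "k", "v": "f", "z": "s", "G": "x"}
VOICE = {voiceless: voiced for voiced, voiceless in DEVOICE.items()}
OBSTRUENTS = set(DEVOICE) | set(VOICE)
FRICATIVES = set("fvszxG")


def findev(c1, c2):
    return DEVOICE.get(c1, c1), c2


def fricdev(c1, c2):
    if c1 in OBSTRUENTS and c2 in FRICATIVES:
        c2 = DEVOICE.get(c2, c2)
    return c1, c2


def rva(c1, c2):
    if c1 in OBSTRUENTS and c2 in OBSTRUENTS:
        c1 = VOICE.get(c1, c1) if c2 in DEVOICE else DEVOICE.get(c1, c1)
    return c1, c2


def stages(c1, c2):
    """Return the successive cluster shapes, one per non-vacuous rule."""
    seen = [(c1, c2)]
    for rule in (findev, fricdev, rva):
        c1, c2 = rule(c1, c2)
        if (c1, c2) != seen[-1]:
            seen.append((c1, c2))
    chain = [f"{a}-{b}" for a, b in seen]
    if c1 == c2:  # DEGEMINATION reduces identical neighbours
        chain.append(f"[{c1}]")
    return " > ".join(chain)


print(stages("d", "d"))  # spoed-debat
print(stages("t", "d"))  # sport-dag
print(stages("s", "z"))  # dans-zaal
```

Vacuous rule applications are skipped in the printed chain, so that only the steps that change the cluster are shown.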


As formulated in (1), FINDEV operates at the syllable- rather than just the 
word-boundary (Booij 1977, Kooij 1980, Zonneveld 1994). This is supported by 
the examples in (10) with a word-internal obstruent before a syllable break, which 
lack a pronunciation difference corresponding to the spelling difference: 


(10) [-t$] ad-miraal ‘admiral’, bad-minton, cad-mium, kid-nap, med-ley, 
ord-ner ‘folder’, pred-nison, Bod-nar, Broad-way, Nebukad-nezar, 
Sverd-lovsk 
cf.: at-las, at-leet ‘athlete’, but-ler, et-nisch ‘ethnic’, fit-ness, part-ner; 
alba-tros ‘albatross’, ma-tras ‘mattress’, a-drenaline ‘adrenalin’


[-p$] (Aha-)Erleb-nis, Ab-ner, Chleb-nikov, Leib-nitz, Zimbab-we
cf.: hyp-nose ‘hypnosis’, rohyp-nol
A-pril, com-plex, bi-bliotheek ‘library’, koli-brie ‘humming bird’


[-f$] Dubrov-nik, Pav-lov
cf.: Daph-ne, Heff-ner
A-frika, moe-flon ‘moufflon’, li-vrei ‘livery’, na-vrant ‘heartrending’


[-x$] dog-ma, enig-ma, frag-ment, pyg-mee ‘pygmy’, mag-neet ‘magnet’, preg-nant
cf.: drach-me ‘drachma’, strych-nine, tech-niek ‘technology’
fla-grant, i-glo ‘igloo’, man-grove, pel-grim ‘pilgrim’, retro-grade


Thus in Dutch Britney and Sidney are a perfectly rhyming pair. These data imply 
that FINDEV follows the syllabification procedure of the grammar, independently 
ensuring that plurals like han$d-en coexist with singulars like han[t]. 

The rule set in (1-4) is pervasive in the language. In addition to the examples 
provided so far, the regressive assimilation vs. fricative devoicing pattern occurs 
inside underived words, too, as shown in (11) (examples taken from a slightly dif- 
ferent discussion in Zonneveld 1983). 


(11) a. plosive right                                            fricative right
        pasta ‘pasta’           labda ‘lambda’                   fatsoen ‘decency’
        octrooi ‘patent’        anekdote [gd] ‘anecdote’         rapsodie ‘rhapsody’
        wodka [tk] ‘vodka’      dukdalf [gd] ‘mooring post’      taxi ‘taxi’
        dochter ‘daughter’      asbest [zb] ‘asbestos’           fosfor ‘phosphorus’
        moskee ‘mosque’         rugby [ɣb] ‘rugby’               ischias [sx] ‘sciatica’

        - but no cases like: *fa[dz]oen


     b. plosive right                                    fricative right
        feest ‘party’         fees$t-en (pl.)            fiets ‘bicycle’     fiet$s-en (pl.)
        kiosk ‘kiosk’         kios$k-en (pl.)            arts ‘doctor’       art$s-en (pl.)
        hoofd ‘head’          hoo[v]$d-en (pl.)          loods ‘pilot’       loo[t]$s-en (pl.)
        smaragd ‘emerald’     smara[ɣ]d-en (pl.)         eclips ‘eclipse’    eclip$s-en (pl.)
        abt ‘abbot’           ab$d-ij ‘abbey’            koets ‘coach’       koet$s-ier ‘coachman’

        - but no cases like: fie[ts] ~ *fie[d$z]-en


The data in (12) below briefly illustrate the applicability of the rules at the phrase
level (the ^ symbol indicates the assimilation site; cf. Nespor & Vogel 1986, Booij
& Rubach 1987, Trommelen 1993, Menert 1994, Ernestus 2000).

(12) klap#deur [b-d] ‘swing door’
- Ik klap ^ deur 5 open [b-d]
[lit.: I swing door #5 open]


rond#vaart [t-f] ‘round trip (by boat)’
- Het schijnt dat hij rond ^ vaart [t-f]
[lit.: It seems that he around sails]


wand#tegel [(d)-t] ‘wall tile’
- Hij wil dat ik de wand ^ tegel [t]
[lit.: He wants that I the wall tile]


Finally,⁵ as affixed forms play an important role in the remainder of this paper,
their properties are discussed separately in the next section. 


3. Affixed forms 

The list in (13) below exhaustively mentions all relevant syllabic (vowel- 
containing) suffixes, showing that the voicing rule set operates when the rightmost 
obstruent of the cluster is affixal (Booij 1977, Trommelen & Zonneveld 1979); 
just as in (5), the data are subdivided into (a)- and (b)-cases depending on the 
manner feature of the right-hand obstruent: 


(13)⁶ a. rijk-e          ‘rich, INFL’     rijk-dom       [g-d]     -dom N          ‘richness’
         paus-en         ‘popes’          paus-dom       [z-d]                     ‘papacy’
         liev-e          ‘sweet, INFL’    lief-de        [v-d]     -de N           ‘love’
         zess-en         ‘sixes’          zes-de         [z-d]     -de Num         ‘sixth’
         vijv-en         ‘fives’          vijf-de        [v-d]                     ‘fifth’
         wijd-e          ‘wide, INFL’     wijd-te        [(d)-t]   -te N           ‘width’
         ruig-e [ɣ-]     ‘rough, INFL’    ruig-te        [x-t]                     ‘roughness’
         berg-en [ɣ-]    ‘mountains’      ge-berg-te     [x-t]     ge-X-te N       ‘mountain range’
         Elizabeth       ‘Elisabeth’      Els-ke         [k-t]     -ke N (names)   ‘little Lizzy’
         denk-en         ‘to think’       denk-baar      [g-b]     -baar A         ‘imaginable’
         buig-en [ɣ-]    ‘to bend’        buig-baar      [ɣ-b]                     ‘pliable’

      b. hand-en         ‘hands’          hand-zaam      [t-s]     -zaam A         ‘manageable’
         volg-en [ɣ-]    ‘to follow’      volg-zaam      [x-s]                     ‘obedient’
         vriend-en       ‘friends’        vriend-schap   [t-s]     -schap N        ‘friendship’
         broed-en        ‘to hatch’       broed-sel      [t-s]     -sel N          ‘hatch’
         stijv-en        ‘to starch’      stijf-sel      [f-s]                     ‘starch’
         acht-en         ‘eights’         acht-ste       [t-s]     -ste Num        ‘eighth’
         twintig-en [ɣ-] ‘twenties’       twintig-ste    [x-s]                     ‘twentieth’
         leid-en         ‘to lead’        leid-ster      [t-s]     -ster N         ‘leader, FEM’
         schrijv-en      ‘to write’       schrijf-ster   [f-s]                     ‘writer, FEM’


Next, (14) lists all relevant — but less numerous — syllabic prefixes, to the same 
effect: 


(14)  brand-en          ‘to burn’         ont-branden     [d-b]    ont-      ‘to ignite’
      vangen            ‘to catch’        ont-vangen      [t-f]              ‘to receive’
      vader             ‘father’          aarts-vader     [ts-f]             ‘patriarch’
      in-dicatie        ‘indication’      ab-dicatie      [b-d]    ab-       ‘abdication’
      re-solutie [-z]   ‘resolution’      ab-solutie      [p-s]              ‘absolution’
      verbaal           ‘verbal’          ad-verbium      [t-f]    ad-       ‘adverb’
      de-vies           ‘motto’           ad-vies         [t-f]              ‘advice’
      re-ductie         ‘reduction’       ob-ductie       [b-d]    ob-       ‘autopsy’
      in-structie       ‘instruction’     ob-structie     [p-s]              ‘obstruction’
      trans-missie      ‘transmission’    trans-vaal      [t-f]    trans-    ‘Transvaal’
      trans-lucide      ‘translucid’      trans-ductie    [z-d]              ‘transduction’


When a vowelless obstruent suffix is attached to an obstruent-final stem, the result
is a completely voiceless final obstruent cluster, as shown in (15):


(15) a. schrijv-en      ‘to write’      schrif-t (-en)     [f-t]     -t N                      ‘(hand) writing’
        scrib-ent       ‘writer’        scrip-t (-en)      [p-t]                               ‘script’
        corrump-eren    ‘to corrupt’    corrup-t (-e)      [p-t]     -t A                      ‘corrupt’
        krabb-en        ‘to scratch’    krab-t             [p-t]     -t 2/3sg. present tense
        wrijv-en        ‘to rub’        wrijf-t            [f-t]
        bonz-en         ‘to hammer’     bons-t             [s-t]
        houd-en         ‘to hold’       houd-t             [(d)-t]
        mog-en          ‘may’           moch-t (-en)       [x-t]     -t irregular past tense
        deug-en         ‘to be good’    deug-d             [x-t]     -d N                      ‘virtue’
                                        deug-d-en          [ɣ-d]                               ‘virtues’

     b. broed-en        ‘to hatch’      broed-s            [t-s]     -s A                      ‘broody’
        zondag-en       ‘Sundays’       zondag-s           [x-s]                               ‘on Sunday’
        trend-y         ‘trendy’        trend-s            [t-s]     -s PL in loans
        snobb-isme      ‘snobbery’      snob-s             [p-s]
        Fredd-ie        ‘Freddy’        Fred-s             [t-s]     -s POSS
        Bobb-ie         ‘Bobby’         Bob-s              [p-s]
        leid-en         ‘to lead’       leid-s (-man)      [t-s]     ‘linking’ -s              ‘leader’
        kalv-eren       ‘calves’        kalf-s (-lever)    [f-s]                               ‘calf’s liver’
        vond-en         ‘found, PL’     vond-st (-en)      [t-s]     -st N                     ‘findings’
        hard-e          ‘hard, INFL’    hard-st (-e)       [t-s]     -st SUPERL                ‘hardest’
        erg-e           ‘bad’           erg-st (-e)        [x-s]                               ‘worst’


There are two possible ways of accounting for these cases: by FINDEV > RVA, or 
by having FINDEV apply in one sweep to any sequence of obstruents ([-son]₁) before
a syllable boundary or ‘in a Coda’. The Coda possibility presupposes a usually
assumed two-step syllable structure procedure fully preceding FINDEV. At
‘level 1’ the Rhyme constituent can maximally contain just a single consonant 
(Trommelen 1983, Kager & Zonneveld 1986, Zonneveld 1993, Fikkert 1998), im- 
plying that the rightmost obstruent of a cluster, and in some cases all obstruents, 
are in a post-Coda Appendix; then at ‘level 2’ restructuring takes place (Booij
1977, Nespor & Vogel 1986, Booij & Rubach 1987, Booij 1988), the result used 
as input to FINDEV. The matter of Dutch obstruent prefixes will be separately ad- 
dressed in section 9 below. 

The most complex behaviour of all affixes is exhibited by the past tense suffix. 
First of all, the suffix’s properties in neutral sonorant-final stem cases are those 
below: 


(16) Infinitive: noem-en ‘to name’
- past tense: noem-d-e = stem + ‘-d-e suffix’
- past participle:⁷ ge-noem-d [-m-t] by FINDEV,
inflected form ge-noem-d-e


Likewise: ski-en ‘to ski’, zoen-en ‘to kiss’, meng-en ‘to mix’, wándel-en
‘to walk’, ádem-en ‘to breathe’, omhéin-en ‘to fence in’, tuinier-en ‘to
practise gardening’, etc.


The past participle involves the discontinuous affix ge-X-d. The -d element undergoes
FINDEV when absolutely final (in the past participle), and is voiced when
followed by an inflectional -e. Unexpectedly, even though the suffix is a plosive,
it itself undergoes assimilation when the stem is obstruent-final; cf. the near- 
minimal pair in (17a). This pattern is completely systematic, as illustrated by the 
additional examples in (17b). 


(17) a. infinitives: krabb-en ‘to scratch’, klapp-en ‘to applaud’
        - past tense: krab-d-e = stem + -d-e vs. klap-t-e = stem + -d-e
        - past participle: ge-krab-d [-p-t], inflected form ge-krab-d-e
                           ge-klap-t [-p-t], inflected form ge-klap-t-e


     b. werk-en        ‘to work’        werk-t-e       [k-t]      ge-werk-t       [k-t]
        stamp-en       ‘to stamp’       stamp-t-e      [p-t]      ge-stamp-t      [p-t]
        tobb-en        ‘to worry’       tob-d-e        [b-d]      ge-tob-d        [p-t]
        plant-en       ‘to plant’       plant-t-e      [(t)-t]    ge-plant        [(t)-t]
        wens-en        ‘to wish’        wens-t-e       [s-t]      ge-wens-t       [s-t]
        juich-en       ‘to cheer’       juich-t-e      [x-t]      ge-juich-t      [x-t]
        golv-en        ‘to undulate’    golf-d-e       [v-d]      ge-golf-d       [f-t]
        kapseiz-en     ‘to capsize’     kapseis-d-e    [z-d]      ge-kapseis-d    [s-t]

        volg-en [ɣ-]   ‘to follow’      volg-d-e       [ɣ-d]      ge-volg-d       [x-t]
        voed-en        ‘to feed’        voed-d-e       [(d)-d]    ge-voed         [(d)-t]
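
The surface generalisation behind (16) and (17) can be stated compactly: the suffix plosive agrees in voicing with a stem-final obstruent, and the word-final cluster of the participle is voiceless throughout. The following sketch encodes only this descriptive pattern, not any of the analyses discussed in this paper; the function names and the ASCII transcription ("G" for the voiced velar fricative) are assumptions, and degemination after t/d-final stems is left out.

```python
# Descriptive sketch of the past tense pattern in (16)-(17).
# Assumptions: stems are represented by their underlying final segment only;
# "G" = voiced velar fricative; degemination (plant-t-e, voed-d-e) is ignored.
DEVOICE = {"b": "p", "d": "t", "v": "f", "z": "s", "G": "x"}
VOICELESS = set(DEVOICE.values()) | {"k"}


def past_tense(stem_final):
    """Cluster in stem + suffix + -e: the suffix is t after a voiceless obstruent."""
    suffix = "t" if stem_final in VOICELESS else "d"
    return f"{stem_final}-{suffix}"


def past_participle(stem_final):
    """Word-final cluster of ge-X-d: FINDEV leaves it voiceless throughout."""
    return f"{DEVOICE.get(stem_final, stem_final)}-t"


for stem in ("b", "p", "v", "G", "m"):  # krab-, klap-, golv-, volg-, noem-
    print(stem, past_tense(stem), past_participle(stem))
```

The contrast between krab-d-e and klap-t-e thus reduces to the voice value of the stem-final obstruent, which is exactly what makes the suffix look like a fricative with respect to (3b).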


Assimilating progressively, this suffix⁸ behaves differently from all other voiced
plosive-initial suffixes (-dom, -de Num, -d, -baar). Trommelen & Zonneveld (1979)
and Zonneveld (1982) argue that what is observed here is, uniquely, a plosive
showing the assimilatory properties of a fricative. Their analysis assumes the following
steps. The underlying form of the suffix is the voiced fricative /-ð/; after an
obstruent this form assimilates progressively, by version (3b) of FRICDEV mentioning
[-voice], producing /-θ/; at the end of the rule-based derivations the abstract
fricatives are converted into the plosives required by the language.

In any analysis of the past tense, a separate concern is how to block FINDEV in
this context: FINDEV (i) applies at the syllable edge, and (ii) crucially feeds the
[-voice] version of FRICDEV (cf. (8b)) assumed for past tense assimilation. The
derivations of a compound form such as volg-zucht ‘submissiveness’ ([x-s]) and a
suffixed form like volg-zaam ‘docile’ (also [x-s], cf. (13)) critically rely on this
ordering; but the past tense of the associated verb is volg-de [ɣ-də], not *[x-tə].
Trommelen and Zonneveld’s analysis blocks FINDEV by invoking pre-FINDEV
resyllabification. They propose that inflectional paradigms of Dutch weak verbs
contain a ‘theme vowel’, effectuating a syllabification automatically preempting
FINDEV: /vol$ɣ-e-ð-e/. They provide independent evidence for this theme vowel
(which will be returned to in section 8 below); a separate rule deletes it later in the
derivation (prior to FRICDEV, which needs adjacent obstruents). Some typical derivations⁹
look as follows:


(18)                        /volɣ-e-ð-e/    /juix-e-ð-e/    /volɣ-zaam/
     Syllabif                   l$ɣ             i$x            lɣ$z
     FinD                       n.a.            n.a.            x
     Theme-V Del                 ∅               ∅             n.a.
     FricD using [-voice]       n.a.            x θ            x s
     late rule                  ɣ d             x t            n.a.
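
The logic of (18) can be paraphrased procedurally. The sketch below is a simplification made for illustration: a form is reduced to its stem-final consonant, the initial consonant of the suffix, and a flag for the presence of the theme vowel; "D" and "T" are ASCII stand-ins for the abstract fricatives /ð/ and /θ/, and "G" for the voiced velar fricative. None of these encoding choices comes from the original analysis.

```python
# Sketch of the abstract derivations in (18).
# Assumptions: "G" = voiced velar fricative, "D"/"T" = abstract /ð/ and /θ/;
# a form is (stem-final consonant, suffix-initial consonant, theme vowel flag).
DEVOICE = {"b": "p", "d": "t", "v": "f", "z": "s", "G": "x", "D": "T"}
VOICELESS = set(DEVOICE.values()) | {"k"}
FRICATIVES = set("fvszxGDT")
LATE_RULE = {"D": "d", "T": "t"}  # abstract fricatives surface as plosives


def derive(stem_final, suffix_initial, theme_vowel):
    c1, c2 = stem_final, suffix_initial
    # Syllabification + FinD: with a theme vowel, c1 is an onset and escapes
    # devoicing; without one, c1 is syllable-final and devoices.
    if not theme_vowel:
        c1 = DEVOICE.get(c1, c1)
    # Theme vowel deletion makes c1 and c2 adjacent (implicit in this encoding).
    # FricD, version (3b): a fricative devoices after a voiceless obstruent.
    if c1 in VOICELESS and c2 in FRICATIVES:
        c2 = DEVOICE.get(c2, c2)
    # Late rule: the abstract fricative becomes a plosive.
    c2 = LATE_RULE.get(c2, c2)
    return f"{c1}-{c2}"


print(derive("G", "D", True))   # volg-de
print(derive("x", "D", True))   # juich-te
print(derive("G", "z", False))  # volg-zaam
print(derive("G", "D", False))  # volg-de without the theme vowel
```

The last call shows what the theme vowel is for: without it, FinD would feed FricD and the sketch returns the unattested voiceless cluster for volg-de.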


Clearly this analysis is highly abstract in the sense of the ‘abstractness debate’ in 
1970’s and 1980’s phonology (see e.g. Kiparsky 1982), and hence not uncontro- 
versial. The alternative is simply a rule of progressive assimilation for the past 
tense suffix alone, as in (19) (van der Hulst & Kooij 1981, Berendsen 1983, 
1986), separately preceding the set in (1-4). 


(19) [-son, PT] → [-voice] / [-voice] ___


In terms of rule-based phonology, however, it might be considered a drawback 
that the formal similarity of this rule to FRICDEV is left unexpressed, i.e., an op- 
portunity to formally unify the two cases is left untaken. Moreover, such a rule’s 
use of the feature value [—voice] has been argued to be unavailable to phonology, 
as discussed in the next section. 


4. Non-linear Phonology 

Lombardi (1991, 1995) formulates a typology of voicing assimilation lan- 
guages in a non-linear Principles and Parameters (P & P) framework. Her work, 
which is central to virtually all generative-theoretical discussions on the feature 
[voice] of the past decade, basically renders the above analysis of Dutch into P & 
P. As a point of departure, consider some straightforward examples of regressive 
assimilation involving stops:

(20)¹⁰
             voi          voi voi          voi         Voice tier
              |            |   |            |
             Lar          Lar Lar          Lar         Laryngeal node
              |            |   |            |
      strop - das      bloed - bank      hand - palm
output:  [b - d]          [d - b]          [t - p]


Lombardi’s account uses the following components: 

- (i) a single-valued (‘privative’) feature [voice] (following a proposal by Ito & 
Mester 1986 towards incorporating this notion in UG); 

- (ii) a universal markedness convention specifying obstruents as voiceless ‘by 
default’ in the output when not [voice]; 

- (iii) a universal constraint, to the effect that [voice] be ‘licensed’ in Onsets only 
(in fact just in the context [___ [+son]],); this constraint is available to languages 
but not necessarily present in all of them, i.e. it is a parametric option; 

- (iv) Spread-[voice], also a parametric option (possibly a language-particular set- 
ting of the universal SPREAD parameter of Piggott 1988); SPREAD-voice is always 
regressive. 

Examples such as those in (20) run as follows. First, syllable-final [voice] fea- 
tures are declared void by licensing, depriving the rightmost two examples of their 
compound-internal word-final feature. Second, SPREAD-voice regressively dis- 
tributes [voice] onto the clusters of the leftmost two examples. Finally, the mark- 
edness convention specifies both members of the cluster in the third example as 
voiceless. Delinking by the licensing convention and end-of-the-line insertion of 
voicelessness is how this analysis deals with cases of final devoicing (hand →
han[t]), for which there is no separate rule. Delinking (=Licensing) > Spreading >
default is the natural order of the components of this analysis. 
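
The three components, in the order Licensing > Spreading > default, can be sketched with a privative encoding in which an obstruent either carries a [voice] feature or carries nothing. The representation chosen below (a pair of a voiceless base symbol and a boolean), the table VOICED_FORM and the function name surface are assumptions made for this illustration and are not Lombardi's formalism.

```python
# Sketch of Licensing > Spreading > default with privative [voice].
# Assumptions: an obstruent is a pair (base, voice), where base is the
# voiceless symbol and voice is True when a [voice] feature is linked.
VOICED_FORM = {"p": "b", "t": "d", "k": "g", "f": "v", "s": "z", "x": "G"}


def surface(coda, onset):
    """Surface cluster for a syllable-final obstruent plus an onset plosive."""
    (base1, voice1), (base2, voice2) = coda, onset
    voice1 = False  # Licensing: [voice] is not licensed syllable-finally
    if voice2:      # Spreading: regressive, from the licensed onset
        voice1 = True
    # Default: an obstruent without [voice] is realised as voiceless.
    first = VOICED_FORM[base1] if voice1 else base1
    second = VOICED_FORM[base2] if voice2 else base2
    return f"{first}-{second}"


print(surface(("p", False), ("t", True)))  # strop-das
print(surface(("t", True), ("p", True)))   # bloed-bank
print(surface(("t", True), ("p", False)))  # hand-palm
```

Note that no step ever refers to a value [-voice]: voicelessness arises only from the absence of the feature.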

Fricative devoicing does not follow naturally from these proposals: the analysis
resorts to a language-specific rule.¹¹ Since [-voice] is theoretically impossible,
the non-linear version of FRICDEV must be the equivalent of (3a), delinking licensed
[voice] after [-son], as in (21a):


(21) a. rule (non-linear FRICDEV):      b. examples:
        [-son]  [-son, +cont]              /stop # verf/      /rond # vaart/
                      ‡                           |              |    |
                     voi                         voi            voi  voi

Licensing, FRICDEV, and the markedness convention derive an entirely voiceless 
cluster in both cases of (21b). Lombardi (1991:51) correctly observes that non- 
linear FRICDEV “must apply before the spreading of [voice]”. 
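
Why the delinking rule has to precede spreading can be seen by running the steps in both orders. As before, this is a sketch under assumptions introduced here: obstruents are pairs of a voiceless base symbol and a boolean for a linked [voice] feature, and the step functions are hypothetical names rather than part of the cited analysis.

```python
# Sketch: non-linear FRICDEV must apply before spreading.
# Assumptions: (base, voice) pairs as above; FRICATIVES lists voiceless bases.
VOICED_FORM = {"p": "b", "t": "d", "k": "g", "f": "v", "s": "z", "x": "G"}
FRICATIVES = set("fsx")


def licensing(coda, onset):
    """Delink [voice] from the syllable-final obstruent."""
    return (coda[0], False), onset


def fricdev(coda, onset):
    """Delink [voice] from a fricative that follows an obstruent."""
    if onset[0] in FRICATIVES:
        onset = (onset[0], False)
    return coda, onset


def spread(coda, onset):
    """Spread [voice] regressively from the onset obstruent."""
    if onset[1]:
        coda = (coda[0], True)
    return coda, onset


def derive(coda, onset, steps):
    for step in steps:
        coda, onset = step(coda, onset)
    # Default: segments without [voice] surface as voiceless.
    return "-".join(VOICED_FORM[b] if v else b for b, v in (coda, onset))


ROND_VAART = (("t", True), ("f", True))
print(derive(*ROND_VAART, [licensing, fricdev, spread]))  # FRICDEV first
print(derive(*ROND_VAART, [licensing, spread, fricdev]))  # spreading first
```

With spreading first, the fricative's [voice] reaches the preceding stop before it is delinked, and the sketch yields a mixed cluster instead of the voiceless one in (5b).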

The assumption of impossible [—voice] begs the question of the fate of the past 
tense suffix. Not unproblematically, the above analysis is more strongly geared 
towards voiceless clusters than before: both licensing and FRICDEV (in the context 
of [—son]) predict voicelessness. (22a) below shows Lombardi’s solution for the 
past tense:

(22)      a. /krab-ð-e/      b. /bloed#bank/      /huiz#baas/
Fusion           \ /                 \ /                \ /
                 voi                 voi                voi


Fusion merges the [voice] feature of two immediately adjacent obstruents, here 
pre-empting any delinking, whether by licensing (*juil[y]-de, cf. (18)) or by 
FRICDEV (*zor[y]-te). Although primarily intended for the past tenses, Lombardi 
(1991:46-51) points out that fusion has a beneficial side effect in compounds such 
as those in (22b) (originally the middlemost in (20)): the “rather futile devoicing 
and revoicing does not in fact take place: rather, [I submit] that [voice] specifica- 
tions that become adjacent fuse into one linked structure, which is parasitically 
licensed just like the structure created by spreading.” This can be seen as an at- 
tempt at avoiding a non-linear variety of the Duke of York gambit (recall section 
2, fn. 4). 

This side effect also has a downside, however. Cases such as rond-vaart [t-f], 
drij/v/-zand [f-s] and hui/z/-vuil [s-f] with a right-hand fricative should end up 
with voiceless clusters, cf. (5b) and (21), from underlyingly completely voiced 
ones. In order to cover these, Lombardi proposes the following (1995:58-59): “If
Laryngeal Nodes fuse automatically and the linked structure escapes
[[Voice]-Licensing], how can Progressive Assimilation apply in a form like (22f)
[/huiz+vuil/ → [-sf-]]? The answer is that all of those forms have a
compound-type boundary. In this form, /z/ has already devoiced before the level where
[/vuil/] is added, as we can see from the fact that morpheme-final obstruents are
voiceless in a compound even when the second member is vowel-initial: (27)
huid+arts → hui[t]arts. So there is no point in the derivation of (22f) at which
two voiced consonants come together, and so there is no Fusion.” This is having
your cake and eating it, however: the analysis cannot simultaneously have fusion in
compounds such as bloed-bank and hui/z/-baas avoiding Duke derivations, and no
fusion in compounds such as hui/z/-vuil and rond-vaart assuming prior application
of the licensing convention. Noting this, Wetzels & Mascaró (2001:233-234),
in a paper extensively reviewing the tenets of Lombardi’s work’, conclude that 
fusion fails, that where fusion fails there is no analysis of the Dutch past tense, 
and that the only remaining option involves a rule mentioning [—voice], i.e. a 
separate past tense assimilation rule as in (19), discrediting the privative [voice] 
hypothesis. This seems premature. Since the compound derivations are technically 
perfect without it, fusion could be formulated so as to cease to be relevant to the 
compounds, for instance as a morphologically conditioned language-specific rule, 
suggesting that considering it a universal may have been overly ambitious.” 

A small handful of analyses have been proposed in the literature as rivals to 
Lombardi’s account of the Dutch past tense. Booij (1995:61-62) maintains binary
[±voice], and proposes that the suffix be underspecified for [voice]. Such
language-specific three-way contrasts encoding idiosyncratic behaviour are not
uncontroversial.'* Even if accepted, however, getting the voice values right in the
appropriate contexts is now a serious problem: his rule progressively assimilating
past tense -d to the final sound (rather than just obstruent) of the stem seems
decidedly odd from a typological point of view. Devoicing is blocked by fusion, but
the implications for Dutch compounds are left undiscussed. Iverson & Salmons
(2003) propose a rule delinking underlying past tense /-d/’s complete Lar-node, 
after [-son] just as in (21a); it triggers a process finding /-d/ a new Lar-node, suc- 
ceeding to the left, -d adopting the contents of that node. Spreading is blocked be- 
cause it needs Lar-nodes, and FRICDEV is irrelevant because the suffix is not a 
fricative.'° The redeeming features of this analysis are that it lacks both fusion and 
abstractness, and maintains privative features (cf. fn. 12). The following
properties seem seriously questionable, on the other hand: the blocking of devoicing
in the past tense is left undiscussed; the past tense delinking rule duplicates
FRICDEV’s context; and its derivation of an intermediate Lar-less obstruent can only
be seen as a brute force step, with the sole function of enabling ‘progressive
spreading’: this particular intermediate type of node structure has no other purpose
in the phonology of the language.

Thus the Dutch past tense remains a descriptive challenge, also in non-linear 
phonology. This aside, one of the attractive features of Lombardi’s work is the 
author’s attempt at sketching the contours of a typological theory of voicing as- 
similation languages of the world, by a small set of universal parameters.'® In 
Lombardi’s P & P classification, Dutch shares a slot with Polish and Catalan,
which have word-final devoicing (the Licensing constraint) and Spreading; German
has just the first and no Spreading. Languages which have internal assimilation
but no word-final devoicing have extraprosodic final obstruents: an example
may be Yiddish. English has voiced and voiceless obstruents in opposition to
one another in both margins of the syllable, so it has neither Spreading nor
licensing.


5. An analysis in Optimality Theory (L) 

Lombardi (1999) develops an account of voicing assimilation within Optimal- 
ity Theory (Prince & Smolensky 1993, McCarthy & Prince 1993a,b), using uni- 
versal but violable constraints, and building on the non-linear analysis outlined in 
the previous section. Dutch is dealt with incompletely: regressive assimilation is 
covered, but fricative devoicing and past tense assimilation are no more than 
hinted at. The incompleteness is a result of the overall approach, as acknowledged 
by Lombardi. 

First, [voice] continues to be privative. Second, among the universal OT con- 
straints, the analysis contains two markedness constraints, one stating that voiced 
is marked relative to voiceless (‘lacking [voice]’), the other that assimilation is 
natural among adjacent obstruents (the OT version of Spreading): 


(23) *LAR: Do not have laryngeal features 
AGREE: Obstruent clusters agree in voicing 




In the AGREE constraint the notion of ‘obstruent cluster’ functions as the, possibly 
preliminary, ‘domain’ of the constraint (for further discussion, see Lombardi 
(1999:272), and below). 

Third, two faithfulness constraints counteract phonological activity in a 
grammar: 


(24) IDLARYNGEAL: Consonants are faithful to underlying laryngeal specification.
     IDONSETLAR: Onset consonants are faithful to underlying laryngeal
     specification.


IDONSETLAR is the OT version of Onset-[voice] licensing (so again ‘onset’ implies
the [___ [+son]] context). It is linked to work by Beckman (1995, 1998) and
others who proposed similar ‘positional faithfulness’ constraints “as a way of ac- 
counting for the observation [...] that languages may maintain a distinction only in 
prominent positions and neutralize it elsewhere” (Lombardi 1999:270-271). 

A language like Dutch uses these constraints in the following manner: 


(25) a. Final Devoicing

        /hand/            AGREE   IDONSLAR   *LAR   IDLAR
           [hand]                             *!
        ☞  [hant]                                     *

     b. Regressive voicing assimilation: voiced

        /strop # das/     AGREE   IDONSLAR   *LAR   IDLAR
           strop # das      *!                *
           strop # tas               *!               *
        ☞  strob # das                        **      *

     c. Regressive voicing assimilation: voiceless

        /hand # palm/     AGREE   IDONSLAR   *LAR   IDLAR
           hand # palm      *!                *
           hand # balm               *!       **      *
        ☞  hant # palm                                *


Thus, in this framework: 
- regressive assimilation = high ranked AGREE and IDONSLAR, 
- contrast (in onsets) = IDONSETLAR » *LAR, and 
- (final) devoicing = (IDONSETLAR ») *LAR » IDLAR. 
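
The evaluations in (25) can be replayed mechanically. What follows is a minimal sketch in Python, not Lombardi's own formalization. It assumes that a form can be reduced to its adjacent obstruents, each encoded as a hypothetical (position, voiced) pair, that GEN varies nothing but voicing, and that AGREE and IDONSLAR, which the text leaves mutually unranked, may be listed in either order.

```python
from itertools import product

# Simplifying assumption: a form is reduced to its adjacent obstruents.
# Each obstruent is a (position, voiced) pair; position is "onset" or "coda".

def agree(inp, out):
    """AGREE: one mark per adjacent obstruent pair that differs in voicing."""
    return sum(a[1] != b[1] for a, b in zip(out, out[1:]))

def id_ons_lar(inp, out):
    """IDONSLAR: one mark per onset obstruent whose voicing was changed."""
    return sum(i[1] != o[1] for i, o in zip(inp, out) if o[0] == "onset")

def star_lar(inp, out):
    """*LAR: one mark per voiced obstruent."""
    return sum(o[1] for o in out)

def id_lar(inp, out):
    """IDLAR: one mark per obstruent whose voicing was changed."""
    return sum(i[1] != o[1] for i, o in zip(inp, out))

# AGREE, IDONSLAR >> *LAR >> IDLAR (the order of the first two is arbitrary here).
DUTCH = [agree, id_ons_lar, star_lar, id_lar]

def candidates(inp):
    """Toy GEN: every voiced/voiceless combination over the input obstruents."""
    return [tuple((seg[0], v) for seg, v in zip(inp, voicing))
            for voicing in product([False, True], repeat=len(inp))]

def optimal(inp, ranking):
    """Strict domination: compare violation vectors lexicographically."""
    return min(candidates(inp), key=lambda out: [c(inp, out) for c in ranking])

def show(out):
    """Render a candidate as a string of '+' (voiced) and '-' (voiceless)."""
    return "".join("+" if voiced else "-" for _, voiced in out)

hand = (("coda", True),)                          # /hand/: final /d/
strop_das = (("coda", False), ("onset", True))    # /strop # das/: /p/ + /d/
hand_palm = (("coda", True), ("onset", False))    # /hand # palm/: /d/ + /p/

for name, form in [("hand", hand), ("strop#das", strop_das), ("hand#palm", hand_palm)]:
    print(name, show(form), "->", show(optimal(form, DUTCH)))
```

Under these assumptions the sketch selects a voiceless final obstruent for /hand/, a fully voiced cluster for /strop # das/ and a fully voiceless one for /hand # palm/, i.e. the winners of (25a-c).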

One of the most important aspects of Lombardi’s approach is her factorial ty- 
pology of languages, using the above constraints. Dutch belongs to the class of 
(26a) below:

(26) a. AGREE, IDONSETLAR » *LAR » IDLAR                          Dutch, Polish
        = (syllable) final devoicing, regressive voicing assimilation

     b. AGREE, IDONSETLAR » IDLAR » *LAR                          Yiddish
        = no final devoicing, regressive voicing assimilation

     c. IDONSETLAR » *LAR » AGREE, IDLAR                          German
        = (syllable) final devoicing, no voicing assimilation

     d. IDLAR » AGREE, *LAR (ranking of IDONSETLAR irrelevant)    English, Georgian
        = no final devoicing, no voicing assimilation

     e. *LAR » IDLAR, IDONSETLAR (ranking of AGREE irrelevant)    Maori, Arabela
        = no voice distinction at all (no voiced obstruents)

     f. AGREE, IDLAR » *LAR » IDONSETLAR                          Swedish
        = no final devoicing, bidirectional spread of voiceless
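
A factorial typology of this kind can be explored by brute force: permute the four constraints and record what each total ranking does to a few probe inputs. The sketch below is an illustration under strong assumptions, not Lombardi's procedure. Forms are reduced to obstruents encoded as (position, voiced) pairs, GEN only varies voicing, and the four probe inputs and their labels are hypothetical.

```python
from itertools import permutations, product

# Toy model: obstruents only, as (position, voiced) pairs; GEN varies voicing only.
def agree(i, o):    return sum(a[1] != b[1] for a, b in zip(o, o[1:]))
def id_ons(i, o):   return sum(x[1] != y[1] for x, y in zip(i, o) if y[0] == "onset")
def star_lar(i, o): return sum(y[1] for y in o)
def id_lar(i, o):   return sum(x[1] != y[1] for x, y in zip(i, o))

CONSTRAINTS = {"AGREE": agree, "IDONSLAR": id_ons, "*LAR": star_lar, "IDLAR": id_lar}

# Probe inputs (hypothetical labels): a final voiced coda, a voiced onset,
# a voiceless+voiced cluster and a voiced+voiceless cluster.
PROBES = {
    "final /d/": (("coda", True),),
    "onset /b/": (("onset", True),),
    "/t.b/": (("coda", False), ("onset", True)),
    "/d.p/": (("coda", True), ("onset", False)),
}

def optimal(inp, ranking):
    """Winner under strict domination, rendered as '+'/'-' per obstruent."""
    cands = [tuple((s[0], v) for s, v in zip(inp, vs))
             for vs in product([False, True], repeat=len(inp))]
    best = min(cands, key=lambda o: [CONSTRAINTS[c](inp, o) for c in ranking])
    return "".join("+" if v else "-" for _, v in best)

def language(ranking):
    """Surface voicing for each probe under one total ranking."""
    return tuple(optimal(inp, ranking) for inp in PROBES.values())

# Group all 24 total rankings by the surface pattern they generate.
typology = {}
for ranking in permutations(CONSTRAINTS):
    typology.setdefault(language(ranking), []).append(" >> ".join(ranking))

for pattern, rankings in typology.items():
    print(dict(zip(PROBES, pattern)), len(rankings), "rankings")
```

For instance, `language(("AGREE", "IDLAR", "*LAR", "IDONSLAR"))` devoices both cluster probes while leaving the final coda and the onset voiced, which corresponds to the bidirectional spread of voicelessness listed for Swedish.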


Lombardi points out that the overall constraint system'® makes an empirical
prediction regarding the direction of voicing assimilation in languages
(1999:287-290):


(27) 


The licensing analysis of Lombardi (1991, 199[5]) goes some way towards explaining 
the dominance of laryngeal contrasts in the onset over those in the coda. However, in 
that analysis the rule of voicing assimilation still must stipulate direction. The well- 
formedness constraints alone could not prevent coda voiced obstruents from spreading 
[voice] to an onset, as this would result on the surface in the same well-formed doubly 
linked structure that results from regressive assimilation. 

Thus, the earlier analysis still needs to resort to stipulation to account for the fact 
that voicing assimilation is overwhelmingly regressive in direction. [...] In contrast, the 
present analysis predicts that when only these basic constraints are sufficiently high 
ranked to be active, only regressive assimilation will be possible. 


However, progressive assimilation is not completely ruled out: 


(28) 


[B]ecause the AGREE constraint is not inherently directional, progressive assimilation 
will still be possible, but only if higher-ranked constraints intervene to override the ef- 
fects of IDONSLAR. [...] In all languages I know of where voicing assimilation simply 
applies to all clusters with no further restrictions on environment, it is regressive. All 
the cases of progressive assimilation I have found, in contrast, have some further mor- 
phological or phonological restrictions on the context of assimilation, showing the ac- 
tion of additional constraints. 


As an illustration, she gives the example of the progressive voicing assimilation
found in the English plural: cat-[s] vs. dog-[z]. Pre-OT Lombardi (1991:170)
already invoked the universal constraint in (29) (due to Mester & Ito 1989, referring
back to Harms 1973, hence called Harms’s Generalization, and Greenberg 1978):


(29) Harms’s Generalization:
Voiced obstruents must be closer than voiceless obstruents to the syllabic 
nucleus. 


Assuming /-z/ as the suffix’s underlying form, as is commonly done, this constraint
enforces /kæt-z/ becoming [kæt-s]. She adds that, if the English plural effect
“were a language-particular rule, we would expect to find a language exactly
like English except without this rule, which would have examples like *wim[pz]. 
However, such examples do not occur in any language”. Employing the same 
constraint in OT, Lombardi (1999:288-289) gives English tableaux such as those 
in (30): 


(30) a. cat-[s] in English

        /cat - z/       HAGEN    IDLAR    *LAR
           cat - z        *!                *
           cad - z                 *        *!*
        ☞  cat - s                 *

     b. dog-[z] in English'®

        /dog - z/       HAGEN    IDLAR    *LAR
        ☞  dog - z                          **
           dok - s                 *!*

This example shows how progressive assimilation can be enforced by high- 
ranking an independent constraint, in this case Harms’s Generalization.”’ Notice 
at the same time, however, that — pace Lombardi — this English case is not typical 
of the ones described in (27-28). English is not a regressive assimilation language, 
so the case does not show how Harms enforces ‘progressive overriding regres- 
sive’: it shows Constraint » IDLAR rather than Constraint » IDONSETLAR in a situ- 
ation in which IDONSETLAR is immaterial. Moreover, if Harms is, as claimed, not 
violated “in any language” (Lombardi 1995:62) it is a candidate principle for in- 
clusion in the Gen component of the grammar, violations then not even being 
members of the candidate set in any language. 

Let us then, even more curious than before, turn to the form a description of 
the Dutch facts takes in this framework. Dutch is included (Lombardi 1999:289- 
290) in a small list of languages claimed to illustrate (28). It is described relatively 
succinctly as follows: “Other cases of progressive assimilation similarly show re- 
strictions to special circumstances. [...] Dutch [shows them] when the second con- 
sonant is a fricative (see Lombardi 1997 for data and analyses) [and in the] past 
tense morpheme (Lombardi 1991, [1995])”.”' Lombardi (1997) of this quote is an 
unpublished paper in which a reinterpretation of Dutch fricative devoicing is sug- 
gested in terms of the following OT constraint: 


(31) FRICATIVEDEVOICING: *[-son] [-son, +cont, voice]

This constraint must be high-ranked in order to enforce a progressive assimilation
effect:


(32) Dutch Fricative Devoicing using (31): hand-zaam (rond-vaart, etc.)

        /hand-zaam/      AGREE   FRICDEV   IDONSLAR   *LAR   IDLAR
           hant - zaam     *!                           *      *
           hand - saam     *!                  *        *      *
           hand - zaam               *!                 **
        ☞  hant - saam                         *               **


In this tableau, FRICATIVEDEVOICING (FRICDEV) occupies the lowest position it can
take; it could also be ranked first. When the second obstruent in the cluster is a plosive,
FRICDEV is irrelevant and the tableaux in (25) hold. Lombardi (1997:12-13) con- 
cludes that 


(33) [c]learly more needs to be done to confirm the validity of [(31)] as a part of UG. [...] 

[It] may actually be some kind of constraint interaction effect, and if so the markedness 
of voiced fricatives is likely to be involved. Although all voiced obstruents are marked, it 
appears that voiced fricatives are more marked than voiced stops. [In the past tense s]ome 
constraints specific to this morpheme or to the root/affix distinction are presumably in- 
volved. 


In sections 8 and 9 these speculative remarks will underlie this paper’s new analy- 
sis of Dutch voice. Before that, however, we will consider two existing OT analy- 
ses of the facts by Grijzenhout and Kramer. 
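
The remark on the position of FRICDEV in (32) can be checked with a small evaluation sketch. This is an illustration only: obstruents are encoded as hypothetical (position, voiced, fricative) triples, GEN varies nothing but voicing, FRICDEV is read off (31) as "no voiced fricative immediately after an obstruent", and the third ranking, with FRICDEV below IDONSLAR, is a hypothetical contrast case that is not proposed in the text.

```python
from itertools import product

# Toy model: obstruents as (position, voiced, fricative) triples; GEN varies
# voicing only. FRICDEV follows (31): no voiced fricative after an obstruent.
def agree(i, o):    return sum(a[1] != b[1] for a, b in zip(o, o[1:]))
def fricdev(i, o):  return sum(b[1] and b[2] for a, b in zip(o, o[1:]))
def id_ons(i, o):   return sum(x[1] != y[1] for x, y in zip(i, o) if y[0] == "onset")
def star_lar(i, o): return sum(y[1] for y in o)
def id_lar(i, o):   return sum(x[1] != y[1] for x, y in zip(i, o))

def optimal(inp, ranking):
    """Winner under strict domination, rendered as '+'/'-' per obstruent."""
    cands = [tuple((s[0], v, s[2]) for s, v in zip(inp, vs))
             for vs in product([False, True], repeat=len(inp))]
    best = min(cands, key=lambda o: [c(inp, o) for c in ranking])
    return "".join("+" if seg[1] else "-" for seg in best)

hand_zaam = (("coda", True, False), ("onset", True, True))    # /d/ + /z/
strop_das = (("coda", False, False), ("onset", True, False))  # /p/ + /d/

AS_IN_32 = [agree, fricdev, id_ons, star_lar, id_lar]
FIRST = [fricdev, agree, id_ons, star_lar, id_lar]
TOO_LOW = [agree, id_ons, fricdev, star_lar, id_lar]   # hypothetical ranking

for label, ranking in [("as in (32)", AS_IN_32),
                       ("FRICDEV first", FIRST),
                       ("FRICDEV below IDONSLAR", TOO_LOW)]:
    print(label, optimal(hand_zaam, ranking), optimal(strop_das, ranking))
```

In this toy model both admissible positions of FRICDEV give a voiceless cluster for /hand-zaam/ and leave regressive voicing in /strop-das/ untouched, whereas ranking FRICDEV below IDONSLAR lets the faithful voiced cluster win.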


6. An analysis in Optimality Theory (G & K I) 

Grijzenhout and Kramer (henceforth G & K) continue the discussion of Dutch 
where Lombardi abandons it: they aim at a full analysis of the Dutch facts (includ- 
ing some aspects of cliticization, but on this see fn. 5). In fact, G & K produce two 
different analyses in three papers of the same title. The first analysis is that of G & 
K (1998a), and it will be discussed in this section. The second analysis appears in 
two different variants in G & K (1998b, 2000), and it will be discussed in the next 
section. 

The common core of all of G & K’s analyses is depicted in tableaux (34a,b). 
Compared with (32) above, a pair of alternative constraints occupies exactly
FRICDEV’s spot.

(34) a. Dutch regressive assimilation (G & K core proposal): strop-das (denkbaar, etc.)

        /strop-das/      AGREE   ID-ω-ONSET   ω-FINAL      IDLAR   *LAR
                                 STOPLAR      DEVOICING
           strop - das     *!                                        *
           strob - tas     *!       *            *           **      *
           strop - tas              *!                       *
        ☞  strob - das                           *           *       **


     b. Dutch fricative devoicing (G & K core proposal): hand-zaam (rond-vaart, etc.)

        /hand-zaam/      AGREE   ID-ω-ONSET   ω-FINAL      IDLAR   *LAR
                                 STOPLAR      DEVOICING
           hant - zaam     *!                                *       *
           hand - saam     *!                    *           *       *
           hand - zaam                           *!                  **
        ☞  hant - saam                                       **


All regressive assimilations follow from a STOP variant of IDONSETLAR, confined
to the onset of Prosodic Words (ω’s), in combination with AGREE. Fricative
Devoicing is enforced by FINALDEVOICING, in crucial combination with the two
earlier constraints.

The constraint details of this alternative proposal work out as follows: 


(35) a. *LAR is maintained, but two ‘relativized’ variants are added (G & K 1998a:14):
        - No laryngeal features: at the end of syllables = SYLLABLE-FINAL DEVOICING
        - No laryngeal features: at the end of words = PROSODIC WORD-FINAL DEVOICING

     b. The constraint IDSTEMLAR is introduced: segments of a stem are faithful
        to their underlying specification (G & K 1998a:25-26)

     c. FRICATIVE DEVOICING is abandoned because, from Lombardi’s (33), G & K
        (1998a:21) draw the conclusion that the constraint is “not very elegant”;
        its place is taken by WORD-FINAL DEVOICING in combination with (preceded
        by) the constraint in (i) below, which is proposed to be the “local
        conjunction” of (ii) and (iii) (which are therefore also in G & K’s UG
        constraint pool):

        (i)   IDENTITY PROSODIC-WORD ONSET STOP [LAR] (G & K 1998a:24)
        (ii)  IDENTITY STOP [LAR] (G & K 1998a:23)
        (iii) IDENTITY PROSODIC-WORD ONSET [LAR] (G & K 1998a:23).

Local conjunction is the procedure by which a ‘conjoined constraint’ is created on
the basis of two independently existing constraints. The procedure was first pro- 
posed in unpublished work by Smolensky (1995), and entails that the conjoined 
constraint is violated by a candidate which violates both parent constraints to- 
gether (also see McCarthy 1999:365, and 2002a:18). We will return below to G & 
K’s application of conjunction to Dutch. Lombardi’s IDONSETLAR constraint does 
not figure in G & K tableaux, but it remains in their UG pool for reasons of facto- 
rial typology, cf. Grijzenhout (2000) and below. 
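
The logic of local conjunction can be made concrete with a short sketch. It assumes, as a simplification that the text does not spell out, that the local domain of the conjunction is the single segment; the `Seg` record and the two hand-coded clusters are hypothetical encodings used for illustration only.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Seg:
    symbol_in: str      # underlying segment
    symbol_out: str     # surface segment
    stop: bool          # True for plosives
    word_onset: bool    # True in the onset of a Prosodic Word
    changed: bool       # True if the laryngeal specification was altered

def id_stop_lar(seg):
    """(35c-ii): a stop must keep its underlying laryngeal specification."""
    return seg.stop and seg.changed

def id_word_onset_lar(seg):
    """(35c-iii): a Prosodic-Word onset must keep its laryngeal specification."""
    return seg.word_onset and seg.changed

def conjoin(c1, c2):
    """Local conjunction, assuming the segment as the local domain:
    violated only where both parent constraints are violated together."""
    return lambda seg: c1(seg) and c2(seg)

id_word_onset_stop_lar = conjoin(id_stop_lar, id_word_onset_lar)   # (35c-i)

def marks(constraint, form):
    """Number of segments in the form that violate the constraint."""
    return sum(1 for seg in form if constraint(seg))

# Obstruent clusters of two candidates (hand-coded, hypothetical encoding):
strop_tas = [Seg("p", "p", True, False, False), Seg("d", "t", True, True, True)]
hant_saam = [Seg("d", "t", True, False, True), Seg("z", "s", False, True, True)]

for name, form in [("strop-tas", strop_tas), ("hant-saam", hant_saam)]:
    print(name,
          marks(id_stop_lar, form),
          marks(id_word_onset_lar, form),
          marks(id_word_onset_stop_lar, form))
```

On this encoding, strop-tas violates the conjoined constraint because its word-initial stop is devoiced, while hant-saam violates each parent constraint once, on different segments, and therefore escapes the conjunction.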

Some of G & K’s constraints rely on the introduction, next to the Syllable (σ),
of the phonological domain of the Prosodic Word (ω), independently motivated
for Dutch in work in Prosodic Phonology such as Nespor & Vogel (1986) and
Booij (1988). These authors assume that Dutch compounds consist not only of
combinations of morphological words, but also of Prosodic Words, which in Prosodic
Phonology serve as domains of phonological processes. Furthermore, a distinction
is made between two types of affixes: some affixes are themselves ω’s, whereas
others are included in ω’s. A vowel-initial suffix like plural or infinitival -en is a
suffix of the latter type, so [volɣ-en] is a single ω. (In itself this is not sufficient to
block final devoicing, because this requires the substring [...l$ɣ...], but ω also
functions as the domain within which (re-)syllabification takes place.) Suffixes
assumed to be separate ω’s include those below, where equivalent (similarly
structured) compounds are added for comparison (non-glossed data are taken from
section 1):


(36) a. [schub]ω-[achtig]ω ‘squamous’     [goud]ω-[achtig]ω ‘golden’
        [paus]ω-[dom]ω                    [denk]ω-[baar]ω                [vriend]ω-[schap]ω
        [werk]ω-[zaam]ω                   [hand]ω-[zaam]ω                [volg]ω-[zaam]ω

     b. goud-ader                         zuig-nap
        lach-bui                          strop-das                      rib-fluweel
        grijp-graag                       rond-vaart                     huis-vuil


The morpheme -achtig is the suffix primarily motivating the class of compound-like
suffixes in the language (Booij 1977): it is vowel-initial, fails to allow
syllabification across the boundary, and triggers final devoicing in the preceding stem.
Suffixes such as -e and -en occur inside ω’s by a number of similar phonological
criteria: they trigger resyllabification and have no final devoicing before them,
and there are no Dutch full words with schwa as their only vowel (Zonneveld
1983:310, Booij 1995:48). By the latter criterion the past tense suffix is also
ω-internal (G & K’s examples presuppose this assumption, but it is left
unmentioned). G & K tableaux for past tense examples look as below:

(37) a. Dutch past tense: voiceless (G & K 1998a)

        /[klap - de]ω/   AGREE   ID-ω-ONSET   ω-FINAL      IDSTEM   IDLAR
                                 STOPLAR      DEVOICING    LAR
           klap - de       *!
           klab - te       *!                                *        **
        ☞  klap - te                                                  *
           klab - de                                         *!       *

     b. Dutch past tense: voiced (G & K 1998a)

        /[krab - de]ω/   AGREE   ID-ω-ONSET   ω-FINAL      IDSTEM   IDLAR
                                 STOPLAR      DEVOICING    LAR
           krap - de       *!                                *        *
           krab - te       *!                                         *
           krap - te                                         *!       **
        ☞  krab - de


The IDSTEMLAR constraint is the central constraint of this past tense progressive 
assimilation analysis. It expresses the fact that for the feature mentioned “it is 
usually more important to be faithful to featural specifications in the root than to 
featural specifications in affixes” (G & K 1998a:26). A constraint family of this
type was first suggested by McCarthy & Prince (1995:364ff.), who — using exam- 
ples from Turkish, Sanskrit and Arabic — proposed that UG contain the metacon- 
straint ROOT-FAITH » AFFIX-FAITH (see Alderete 2003 for a recent in-depth ex- 
ample). 

This account invites a number of comments; making a distinction between 
theoretically flavoured comments and empirical ones, let us deal with them in that 
order. First, the ID-ω-ONSET-STOPLAR constraint is claimed to be the result of the
local conjunction described in (35c). But neither supplying constraint plays a role 
in the analysis, and no attempt is made to independently motivate them either. 
This considerably weakens the appeal to conjunction as the formal source of this, 
by G & K’s own admission, “crucial” constraint, giving one essentially no reason 
to prefer it to, say, FRICDEV.” Second, if, as the authors propose, all Lombardian
constraints (except FRICDEV) will stay in the UG pool, then clearly there is a strik- 
ing redundancy in the system: there are essentially two ways of dealing with final 
devoicing, to wit G & K’s ‘direct’ way, and IDONSETLAR » *LAR » IDLAR, which 
is Lombardi’s way (cf. (25a)). 

The empirical problems are as follows. First, one of the most vexing questions 
of the Dutch past tense pattern, namely ‘how to block syllable final devoicing’, is 
left undiscussed. We know G & K have σ-FINAL DEVOICING next to ω-FINAL
DEVOICING (cf. (35a), and recall cases like me[t]ley from (10)), but it is difficult to
see how the σ-variant can function in the hierarchy. Clearly in order to account for
me[t]ley it must precede ID(STEM)LAR, but the set-up of the past tense is such that
IDSTEMLAR » σ-FINAL DEVOICING must hold, cf. (37b). The conclusion follows
that the analysis simply fails to capture the Dutch past tense. Second, IDSTEMLAR
is empirically problematic in relatively run-of-the-mill cases of regressive
assimilation involving ω-internal suffixes starting with an obstruent, to wit -te, -sel, -ke
and -ster from (13) in the voiceless class, and -de in the voiced class. Consider the
tableau below:


(38) voiceless obstruent-initial suffix (G & K 1998a prediction)

        /[hoog - te]ω/   AGREE   ID-ω-ONSET   ω-FINAL      IDSTEM   IDLAR   (σ-FINAL
        ‘height’                 STOPLAR      DEVOICING    LAR              DEVOICING)
           hoox - de       *!                                *        **
           hoog - te       *!                                                   *
        ☞  hoog - de                                                  *         *
           hoox - te                                         *!       *


These suffixes contain schwa as their only vowel, so cannot be prosodic words. 
Notice that σ-FINAL DEVOICING (added separately in the tableau) could solve this
problem if not for the interference of the past tense pattern just observed. 

The past tense suffix can be directly opposed to a suffix of the same phono- 
logical shape, with the other suffix triggering regressive assimilation. Left unana- 
lysed by G & K, this is the numeral suffix -de ‘-th’, listed in (13a), and further il- 
lustrated below: 


(39) a. sonorant-final numbers           fricative-final numbers
         2: twee    twee-de                5: vijf       vijv-en       vij[v]-de
         3: drie    der-de                 6: zes        zess-en       ze[z]-de
         4: vier    vier-de               11: elf        elv-en        el[v]-de
         7: zeven   zeven-de              12: twaalf     twaalv-en     twaal[v]-de
         9: negen   negen-de              20: twin-tig   twintig-en    twintig-ste
        10: tien    tien-de              100: honderd    honderd-en    honderd-ste
     b. 2^10: twee-tot-de-tien-de        5^6:  vijf-tot-de-ze[z]-de (cf. ze[s])
        3^a:  drie-tot-de-a-de           8^f:  acht-tot-de-e[v]-de (cf. f = e[f])
        4^m:  drie-tot-de-em-de          10^s: tien-tot-de-e[z]-de (cf. s = e[s])
     c. 0: nul   nul-de                  -6: min-zes   min-ze[z]-de
        π: pi    pi-de                   etc.


(39a) shows that the suffix occurs after numbers below ‘20(th)’, after which -ste
takes over completely (‘1st’ is eer-ste; ‘8th’ is acht-ste, possibly because the
number ends in a plosive). These limited cases might be taken to indicate that this
suffix is of very low productivity, and the marked case when compared to its sister
-ste and to fully productive past tense -de. However, although this is not
commonly recognized, numeral -de is very productive, too. It is used in the
‘to-the-power-of’ construction with more than just numbers (39b), and in fact can be
added to anything that strikes one’s mathematical fancy, as long as this stem falls
within its range (39c). Regressive assimilation occurs across-the-board. Thus, this
suffix is a clear counterexample to the G & K analysis, which claims that generally
obstruent-initial ω-internal suffixes will undergo progressive assimilation;
this claim is incorrect. In fact, the past tense suffix is the only suffix to do so.


7. Another analysis in Optimality Theory (G&K II) 

The second G & K analysis of Dutch voicing (1998b, 2000) is partly similar to 
and partly different from the first. It is equivalent right up to the past tense, when 
a route is taken completely different from IDSTEMLAR. Consider the two new past 
tense tableaux below (ID-ω-ONSET STOPLAR and ω-FINAL DEVOICING omitted for
reasons of space; these constraints are not violated by any of the candidates in
(40)):


(40) a. Dutch past tense: voiceless (G & K II)

        /[klap - de]ω/   AGREE | σ-FINAL     ¦ IDONSET | *LAR ¦ IDSTOP ¦ IDLAR
                               | DEVOICING   ¦ LAR     |      ¦ LAR    ¦
           klap - de       *!  |             ¦         |  *   ¦        ¦
        ☞  klap - te           |             ¦    *    |      ¦   *    ¦   *
           klab - de           |     *       ¦         |  **  ¦   *!   ¦   *


     b. Dutch past tense: voiced (G & K II)

        /[krab - de]ω/   AGREE | σ-FINAL     ¦ IDONSET | *LAR ¦ IDSTOP ¦ IDLAR
                               | DEVOICING   ¦ LAR     |      ¦ LAR    ¦
           krap - de       *!  |             ¦         |  *   ¦   *    ¦   *
           krab - te       *!  |     *       ¦    *    |  *   ¦   *    ¦   *
           krap - te           |             ¦    *    |      ¦   **   ¦   *!*
        ☞  krab - de           |     *       ¦         |  **  ¦        ¦


Compared to G & K I, many more constraints are active in ‘the second half’ of the 
analysis, including IDONSETLAR and *LAR from Lombardi’s work, and ID- 
STOPLAR from G&K’s (35cii). 

Three of the comments on the analysis of the previous section apply here, too: 
that on the conjunction underlying ID-ω-ONSETSTOPLAR; that on the final devoicing
redundancy; and the empirical comment about general regressive assimilation
before obstruent-initial ω-internal suffixes. Granting that a striking visual
difference between this analysis and the previous one resides in the dotted lines
between some of the constraints, the apparent (because undisclosed) G&K
interpretation of these lines is the following. For each candidate the total number of
violations must be calculated for the entire block. In these examples this neutralizes the
first constraint pair, and brings the second block into action: the same method
there leads to the empirically correct result. That this calculation procedure is the
correct interpretation of the authors’ intentions is confirmed by the way the
exclamation marks indicate the point of a candidate’s failure. Such a ‘coranking’ of
constraints (Inkelas 1999:183), however, has a standard interpretation in the
literature which is different from G&K’s (Kager 1999:404-407, McCarthy 2002a:26-28,
227): coranked constraints define grammars in which the constraints are mutually
ranked, according to all logical possibilities. When two constraints happen
to be irrelevant to one another, this is without consequence; when two constraints
are in conflict, the two orderings will give different outputs. Then, according to a
proposal originally due to unpublished work by Kiparsky (see Anttila 1997,
McCarthy 2002a:227) the two orderings describe (optional) variation in the output
in a given language. Returning to (40) and considering the block of unordered
constraints comprising σ-FINALDEVOICING and IDONSETLAR, we see that neither
situation holds: the constraints are not mutually irrelevant but they are in conflict;
and the output does not vary but it is fixed.”
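
The difference between the two readings of a coranked block can be computed directly. The sketch below is illustrative only: the violation counts are a reconstruction from the constraint definitions rather than a copy of G & K's tableaux, `SFD` is a hypothetical abbreviation for σ-FINAL DEVOICING, and the strata follow the blocks described for (40).

```python
from itertools import permutations, product

NAMES = ("AGREE", "SFD", "IDONS", "*LAR", "IDSTOP", "IDLAR")   # SFD = σ-FINAL DEVOICING
STRATA = (("AGREE",), ("SFD", "IDONS"), ("*LAR", "IDSTOP", "IDLAR"))

# Violation counts per candidate, in the order of NAMES. These are a
# reconstruction from the constraint definitions, not G & K's own figures.
KLAP_DE = {
    "klap-de": (1, 0, 0, 1, 0, 0),
    "klap-te": (0, 0, 1, 0, 1, 1),
    "klab-de": (0, 1, 0, 2, 1, 1),
}
KRAB_DE = {
    "krap-de": (1, 0, 0, 1, 1, 1),
    "krab-te": (1, 1, 1, 1, 1, 1),
    "krap-te": (0, 0, 1, 0, 2, 2),
    "krab-de": (0, 1, 0, 2, 0, 0),
}

def pooled_winner(tableau):
    """Block-wise evaluation: marks are summed within each stratum."""
    def profile(cand):
        marks = dict(zip(NAMES, tableau[cand]))
        return [sum(marks[c] for c in stratum) for stratum in STRATA]
    return min(tableau, key=profile)

def total_order_winners(tableau):
    """Standard coranking: collect the winner under every total ranking
    that is compatible with the strata."""
    winners = set()
    for blocks in product(*(permutations(s) for s in STRATA)):
        order = [c for block in blocks for c in block]
        def profile(cand):
            marks = dict(zip(NAMES, tableau[cand]))
            return [marks[c] for c in order]
        winners.add(min(tableau, key=profile))
    return winners

for name, tableau in [("/klap-de/", KLAP_DE), ("/krab-de/", KRAB_DE)]:
    print(name, pooled_winner(tableau), sorted(total_order_winners(tableau)))
```

With these counts, summing marks per block returns klap-te and krab-de, whereas the set of total rankings compatible with the blocks yields two different winners for each input, which is the variation that the standard interpretation predicts and that Dutch does not show.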

There is a passage in the OT literature, common to Tesar & Smolensky 
(1998:249-254) and Tesar & Smolensky (2000:47-50), in which a similar proce- 
dure is discussed under the name of the “stratified hierarchy”. Stratified hierar- 
chies are the result of language learning by the process of Constraint Demotion, 
by which constraints obtain lower and lower positions in the hierarchy when the 
child is stepwise trying to conform to the properties of the target language; demo- 
tion is to a lower ‘stratum’, the evaluation properties of which Tesar and Smolen- 
sky describe as in (41a) below. However, these authors also motivate the claim 
that an adult grammar (the endpoint of the acquisition process) always be a totally 
ranked hierarchy, cf. (41b): 


(41) a. When C1 and C2 are in the same stratum, two marks *C1 and *C2 are equally weighted 
in the computation of Harmony. In effect, all constraints in a single stratum are col- 
lapsed, and treated as though they were a single constraint, for the purposes of deter- 
mining the relative Harmony of candidates. Minimal violation with respect to a stratum 
is determined by the candidate incurring the smallest sum of violations assessed by all 
constraints in the stratum. (p. 241) 

b. [T]he learning data are generated by a UG-allowed grammar, which [...] is a totally 
ranked hierarchy. When learning is successful, the learned stratified hierarchy, even if 
not totally ranked, is completely consistent with at least one total ranking. The empirical 
basis [for this claim] is the broad finding that correct typologies of adult hierarchies. 
Generally speaking, allowing constraints to have equal ranking produces empirically 
problematic constraint interactions. 

From the learnability perspective, the formal results given for E[rror] D[riven]
C[onstraint] D[emotion] depend critically on the assumption that the target language is given
by a totally ranked hierarchy. (p. 249) 


Thus, G & K’s use of coranked constraints, or a stratified hierarchy, is at odds 
with existing proposals in the literature, and with Tesar and Smolensky’s reasoned 
view of the ‘endstate’ of the process of first language acquisition. 

Within G & K’s second ‘coranked’ block in (40) two remarkable properties 
hold. First, the two Identity constraints have virtually identical evaluation results. 
In fact, the analysis would work equally well when it simply doubled IDLAR’s 
marks. This is relevant because this is the only place in G & K where IDSTOPLAR
(one of the two constraints underlying G & K’s local conjunction towards
ID-ω-ONSETSTOPLAR) is given a role, although it turns out that it serves to get the
arithmetic right by multiplying the IDLAR effect. Even more remarkably, two
strictly ordered pairs out of the three constraints of this final block give exactly 
the same result as adding up marks, namely IDSTOPLAR » *LAR, or — more simply 
— IDLAR » *LAR. One suspects there must be a reason for not using one of these 
latter ordered pairs (‘avoid ordering when adding up works’, perhaps), but it is not 
disclosed by the authors. 


8. Revisiting -d-e (PT) 

The conclusions from the previous three sections taken together imply that no 
accurate OT account exists of the facts of Dutch voicing. It is the purpose of the 
final two sections of this paper to formulate such an account. The point of depar- 
ture will be Lombardi’s 1999 ‘skeletal’ analysis of these facts (repeated here in 
(42) from (32)), in combination with the comments in (33) on the two areas of 
special interest: fricative devoicing and past tense progressive assimilation. 


(42) = Basic Lombardi OT analysis of Dutch 


        /hand-zaam/      AGREE   FRICDEV   IDONSLAR   *LAR   IDLAR
           hant - zaam     *!                           *      *
           hand - saam     *!                  *        *      *
           hand - zaam               *!                 **
        ☞  hant - saam                         *               **


This analysis does not hinge on a difference between ω- and non-ω-affixation.”*
No stratified blocks appear in it, and it covers the Dutch facts up to the behaviour 
of the past tense suffix — albeit crudely so in so far as it incorporates the FRICDEV 
constraint whose “validity”, (33) submits, “needs to be confirmed”. The next sec- 
tion will deal with FRICDEV, essentially confirming that it is, as (33) also submits, 
“some constraint interaction effect”. The current section deals with past tense as- 
similation. The discussion’s premise is that this case’s analysis cannot be fully 
phonological, because (a) regressive assimilation is the rule both universally and 
language-specifically, as opposed to progressive assimilation; and (b) the (produc- 
tive) Dutch numeral suffix /-de/ shows re- rather than progressive assimilation. 
Given this, an approach such as the following can be envisaged. 

It is a well-known observation about Dutch verbal morphology that the verbal 
‘stem’ is phonologically surprisingly stable (Trommelen & Zonneveld 1979, Koe- 
foed 1979, Kooij 1981, Zonneveld 1982), i.e. it exhibits ‘paradigm uniformity’. 
Past tense progressive assimilation is just one manifestation of that notion, and in 
so far as it is wider, a constraint such as G & K’s IDSTEMLARPT, although inter- 
pretable as a contribution to a formal representation of the idea, is only a small 
part of the story. In fact, Trommelen and Zonneveld’s ‘theme vowel’ analysis dis- 
cussed in section 3 can be seen as an attempt at covering the broader picture in 
one ‘rule-based’ sweep. These authors’ ‘independent evidence’ for it alluded to in 
that section comprises the following:

(43) a. A (minor) rule of ‘open syllable lengthening’ creates alternations in
nouns, but their related verbs have paradigm uniformity:
bad ~ ba:den ‘bath(s)’ vs. ik ba:d ~ ba:den ‘I/to take a bath’;

b. A rule of ‘d-weakening’ before an inflectional schwa creates alterna- 
tions in adjectives, but verbs have paradigm uniformity: 
goed ~ goe[j]-e ‘good’ vs. bloed ‘blood’ ~ ik bloe[j] ~ bloe[j]-en ‘I/to
bleed’;

c. Given schwa-deletion before another vowel, verb stems undergo it 
across the board: 
elite ~ elit-air ‘elit(ist)’ vs. sco:re ‘score’ ~ ik sco:r ~ sco:r(e)-en ‘I/to 
score’; 

d. Given word-final post-schwa n-deletion, verb stems fail to undergo it: 
opə(n) ‘open, adj.’ vs. ik opən ~ opənə(n) ‘I/to open’;

e. Given a rule of ‘loss of final j’, verb stems have paradigm uniformity: 
vlo ~ vloo[j]-en ‘flea(s)’ vs. ik vloo[j] ~ vloo[j]-en ‘I/to flea (one an- 
other)’ 


This survey shows that the syndrome is much larger than just stem preservation in 
past tense progressive assimilation. The stability manifests itself in two types: 
overapplication of a process (a-c), and underapplication (d-e). OT has a way of 
dealing with such patterns in the form of Output-Output Identity. OO-IDENT in- 
tends to cover those cases in which a (morphologically) derived form copies a 
phonological property (or properties) of a Base, which itself is an output (Kager 
1999: Ch.6, McCarthy 1999:174-176). In (44a) below is an OO-IDENT pattern occurring
in Icelandic deverbal action nouns (Benua 1995), involving (in derivational
terms) truncation (of -a) when going from infinitive to derived noun;
(44b) gives (in a slightly condensed version) the tableau of one example, adapted 
from Kager (1999:265ff.), in which the infinitive (output) form serves as the Base. 


(44) a. Icelandic OO-IDENT pattern
klifra [v] ‘to climb’ ~ klifr [v] ‘climbing’
kumra ‘to bleat’ ~ kumr ‘bleating’
puukra ‘to conceal’ ~ puukr ‘concealment’
siifra ‘to lament’ ~ siifr ‘lamentation’
whereas elsewhere: *[-son][+son]# and *VVCC
b. 
/ sifr-a/ - [siifra] | DA-N OO- SONORITY | *VOWEL IO-IDENT 
=o IDENTINF LENGTH 
w> siifr * ig . * 
siif ial * 
sifr **| ie * 
s(ijifr-a *| (*) 


This closely resembles the Dutch situation. Adopting the infinitive as the Base for 
OO-IDENT in this language, too (this is the form in which the underlying voice
value surfaces), the constraint will be placed above IDONSETLAR in the hierarchy,
to enforce progressive assimilation: AGREE » OO-IDINF » IDONSETLAR leads to 
[krab-de] from /krab-de/ (*krap-te) and [klap-te] from /klap-de/ (*klab-de). This 
seems intuitively promising, but unfortunately there is a serious snag: referring 
back to the data in (17), this proposal works for the past tense and the inflected 
past participle, where full progressive assimilation shows up; the uninflected past 
participle, however, undergoes neutralization (final devoicing), cf. ge-kra[p-t] 
from /ge-krab-d/ and ge-kla[p-t] from /ge-klap-d/, which violates paradigm uni- 
formity. The subhierarchy of OO-IDINF » IDONSETLAR (»*LAR) fails to predict 
this: it is the correct hierarchy for progressive past tense assimilation, but *LAR » 
OO-IDINF is the one neutralization calls for. Dutch past tense paradoxes of this 
kind are by now familiar terrain, and a natural question to ask is whether there is a 
constraint or constraint interaction situation that will possibly improve upon this 
performance. The answer is yes, and involves local conjunction. Tableau (45) be- 
low gives the result of conjoining the two pivotal constraints relevant to the verbal 
cases under scrutiny, namely *LAR and OO-IDINF. 


(45) Past tense progressive assimilation and neutralization by local conjunction 
(&); infinitive base: krab (-en) ‘to scratch’, klap (-en) ‘to applaud’ 


                              | *LAR&OO-IDINF | IDONSLAR | *LAR | OO-IDINF
i.   a. /krab-de/      ☞ b-de |               |          | **   |
                         p-te |               | *!       |      | *
     b. /klap-de/        b-de | *!            |          | **   | *
                       ☞ p-te |               | *        |      |
ii.  a. ge-/krab-d/      b-d  |               |          | *!*  |
                       ☞ p-t  |               |          |      | *
     b. ge-/klap-d/      b-d  | *!            |          | **   | *
                       ☞ p-t  |               |          |      |
iii. a. ik /krab/        b    |               |          | *!   |
                       ☞ p    |               |          |      | *
     b. ik /klap/        b    | *!            |          | *    | *
                       ☞ p    |               |          |      |


The tableau shows just those candidates which survive AGREE. Among those 
cases in which the conjoined constraint results in a star, just that of (ib) is crucial: 
this is where progressive assimilation is enforced. The other two are straightfor- 
ward cases of final devoicing, independently explained by IDONSETLAR » *LAR 
(» OO-IDINF). 
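
The evaluation in (45) can be replayed mechanically. The following Python sketch is an illustration only and not part of the analysis proper. It assumes that each candidate can be reduced to the voicing of the stem-final obstruent and of the suffix obstruent, that only candidates surviving AGREE are generated, and that the conjoined constraint is violated whenever *LAR and OO-IDINF are both violated within one candidate (this domain of conjunction is an assumption made for the sketch). The names Form, profile, winner and spell are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Form:
    stem_voiced: bool              # voicing of the stem-final obstruent (b vs. p)
    suffix_voiced: Optional[bool]  # voicing of the suffix obstruent; None = no suffix


def profile(cand: Form, base_voiced: bool, suffix_in_onset: bool) -> Tuple[int, ...]:
    """Violations in ranking order: *LAR&OO-IDINF >> IDONSLAR >> *LAR >> OO-IDINF."""
    star_lar = int(cand.stem_voiced) + int(bool(cand.suffix_voiced))
    oo_id_inf = int(cand.stem_voiced != base_voiced)   # base = infinitive stem
    # The suffix is underlyingly voiced (/-de/, /-d/); only an onset d is protected.
    id_ons_lar = int(suffix_in_onset and cand.suffix_voiced is False)
    conjunction = int(star_lar > 0 and oo_id_inf > 0)  # assumed domain: the candidate
    return (conjunction, id_ons_lar, star_lar, oo_id_inf)


def winner(base_voiced: bool, suffix: Optional[str] = None) -> Form:
    """suffix is None (ik krab), "onset" (-de) or "coda" (-d)."""
    if suffix is None:
        cands = [Form(True, None), Form(False, None)]
    else:
        cands = [Form(True, True), Form(False, False)]  # survivors of AGREE
    # Strict domination amounts to lexicographic comparison of violation tuples.
    return min(cands, key=lambda c: profile(c, base_voiced, suffix == "onset"))


def spell(form: Form, suffix: Optional[str] = None) -> str:
    """Render the relevant obstruents in the notation of tableau (45)."""
    stem = "b" if form.stem_voiced else "p"
    if suffix is None:
        return stem
    ending = ("d" if form.suffix_voiced else "t") + ("e" if suffix == "onset" else "")
    return f"{stem}-{ending}"


for base_voiced, name in ((True, "krab"), (False, "klap")):
    for suffix in ("onset", "coda", None):
        print(name, suffix, spell(winner(base_voiced, suffix), suffix))
```

Under these assumptions the only input for which the conjoined constraint decides the outcome is /klap-de/; in all other cases the lower-ranked constraints already select the winner.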

It is so far not so clear from these progressive assimilation vs. neutralization 
cases why crucially the analysis is couched in terms of output-output rather than 
input-output identity (as in G & K’s IDSTEM constraint). Both options seem to 
work equally well. This issue harks back to data mentioned as early as (11) of this
paper, which (translated into OT terms) now constitute a Richness of the Base
argument. Some currently crucial cases are those of (46):


(46) feest ‘party’ feest-en ‘to party’ fees(t)-te 
gesp ‘clasp’ gesp-en ‘to clasp’ gesp-te 
inkt ‘ink’ inkt-en ‘to roll ink’ ink(t)-te 
hoofd ‘head’ ont-hoo[v-d]-en ‘to behead’ onthoo[v(d)]-de 
fiets ‘bicycle’ fiets-en ‘to ride a bicycle’ fiets-te 
hypothetical kiets *kie[d-z]-en *kie[dz]-de 


As pointed out above, this pattern implies that if a stem ends in an obstruent clus- 
ter, the cluster is underlyingly voiced only if it ends in a plosive; if it ends in a 
fricative, it will always be voiceless. By OT’s assumption of Richness of the Base 
(“no language-particular restrictions on the input”, cf. Prince & Smolensky 1993, 
McCarthy 2002a:70ff.), the constraints will have to explain the latter gap. It will 
follow from the analysis in the next section that this gap is predicted by FRICDEV 
» IDONSETLAR for the infinitive (fiets-en/*kiedz-en) (‘old’ and ‘new’ FRICDEV do 
not differ here) but not for the past tense. This clearly suggests a role for output- 
output identity based on the infinitive, as shown in tableau (47). 


(47) Hypothetical verb ending in a fricative-final cluster, past tense progressive
assimilation; underlying form /kiedz/, inf. kiet$s-en by FRICDEV » IDONSETLAR

/kiedz-de/ ~ [kietsen] | *LAR&OO-IDINF | IDONSLAR | *LAR | OO-IDINF | IO-ID/-STEM/
  kiedz-de             | *!            |          | ***  | *        |
☞ kiets-te             |               | *        |      |          | *


The tableau shows how *LAR conjunction based on output-output identity succeeds
where input-output identity fails. As in (45), the *LAR&OO-IDINF conjunction
is not crucial to those forms of this hypothetical verb which are simply subject
to final devoicing; they are therefore not shown.²⁵

Thus, an empirically correct conjunction analysis exists for all cases of past 
tense assimilation and neutralization. It is nevertheless presented with some reser- 
vations. Its validity partly depends on considerations concerning ‘metaconstraints’ 
on local conjunction, touched upon in the next section when focusing on Dutch 
FRICDEV (in a proposed account of which conjunction again plays a vital role).


9. Fricative Devoicing as (multiple) local conjunction 

The remaining empirical question of this paper concerns the Dutch fricative 
devoicing phenomenon. The proposal put forward here again uses conjunction as 
the core mechanism of the analysis. The constraints involved are the well-known
*LAR and the possibly less familiar *ONSETFRIC, which bans fricatives from occurring
in onsets. *LAR will first undergo self-conjunction, to derive a constraint called
LYMAN’SLAW in recent work by Ito & Mester (1998, 2002, 2003) on Japanese. In
this language, interestingly, stems cannot contain more than one voiced obstruent,
as shown by examples such as those in (48) (2002:33-36):


(48) kaki ‘persimmon’ kagi ‘key’ gaki ‘kid’ *gagi
toku ‘solve’ togu ‘sharpen’ buta ‘pig’ *buda 


Ito and Mester’s proposal is to take out the co-occurrences of two voiced obstruents
by a ‘self-conjoined’ version of *LAR.²⁶ The forms kagi and gaki violate simple
*LAR once each, but it is the double violation in gagi that represents “the
worst of the worst” (McCarthy 2002a:18). LYMAN’SLAW therefore takes the following
form:


(49) LYMAN’SLAW: No co-occurrence of voiced obstruency with itself
(domain: the morpheme). 
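
Self-conjunction as in (49) can be made concrete with a small Python sketch. It is a simplification for illustration: forms are romanized strings with one letter per segment, the set of voiced obstruents is restricted to b, d, g, z, and the conjoined constraint is taken to assign a single mark as soon as *LAR is violated twice within the morpheme. The function names are hypothetical.

```python
VOICED_OBSTRUENTS = frozenset("bdgz")  # assumption: one letter per segment


def star_lar(morpheme: str) -> int:
    """*LAR: one mark per voiced obstruent."""
    return sum(segment in VOICED_OBSTRUENTS for segment in morpheme)


def lymans_law(morpheme: str) -> int:
    """*LAR self-conjoined: violated when *LAR is violated twice in the morpheme."""
    return int(star_lar(morpheme) >= 2)


# The forms of (48), including the two unattested ones (gagi, buda).
for form in ("kaki", "kagi", "gaki", "gagi", "buta", "buda"):
    print(form, star_lar(form), lymans_law(form))
```

On this reading only the unattested forms gagi and buda incur a mark on the conjoined constraint, while kagi, gaki and buta violate simple *LAR alone.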


The current proposal is that the reason Dutch speakers reject *ron[d-v]aart is
closely related to why Japanese speakers reject *gagi. Note that LYMAN’SLAW 
contains a domain statement, which cannot be the one relevant to Dutch. The 
Dutch domain is that of the obstruent cluster, which we know to be an independ- 
ently recognized domain in the AGREE constraint (recall discussion in section 5). 

*ONSETFRIC is first and foremost prominent in the literature on child speech. 
A survey of pre-OT child speech sources discussing phenomena motivating its 
existence can be found in Zonneveld (1999): Leopold (1947), Menn (1971), 
Gierut (1985), Chiat (1989), Fikkert (1994), and others. These phenomena include 
deletion (as in [is”] for fish), stopping (as in [tua] for Schuhe ‘shoes’ in German), 
h-ization (as in [hoy] for vork ‘fork’ in Dutch), metathesis (as in [uz] for zoo, and
[nouz] for snow in English), and so on. The constraint’s effects have been less 
frequently reported for adult languages, but it is not impossible to find potential 
cases of its applicability. Keren Rice (pers. comm.) suggests a case in Athapaskan 
by which t + x > k: “a constraint like *FRICONSET could be used to choose the 
right manner of articulation. There are, of course, fricative onsets, but if there is a 
choice between a fricative and a stop, the stop will always prevail”. Roelandts 
(1962), citing Boyd-Bowman (1955), gives examples of Latin American Spanish 
hypocoristics such as those below involving fricative > p: 


(50) Bonifacio > Pacho Flora > Poya Josefa > Pepa, Pita 
Delfina > Pina Francisco > Paco Ofelia > Pela 
Felipe > Pil José > Pepe Serafina > Pina 


Based on just a small handful of data, the conclusion might be drawn that at some 
stage (Getxo) Basque treated loanwords from Spanish in a similar manner 
(Hualde & Bilbao 1992:3); informal sources suggest that Uyghur, too, treated bor- 
rowings from Arabic and Persian like this. A possible phonetic explanation under- 
lying the constraint might be that the aperiodic sound source of fricatives is not 
very salient prevocalically, making at least plosives better onsets than fricatives.
Bernhardt & Stemberger (1998:432) offer the explanation that in addition to
[—continuant] being a default value, fricatives can appear in codas earlier than in 
onsets because continuancy is preferred throughout rhymes, “grounded in the fact 
that vowels must be made with an open vowel tract”. Another application can be 
found in Gnanadesikan (1995/2004), Pater (1997) and Pater & Barlow (2003), 
who propose the following ‘fixed ranking’ (in Pater’s version) with the function 
of relating the ‘sonority hierarchy’ to syllable position: 


(51) *GL-ONSET » *LIQ-ONSET » *NAS-ONSET » *FRIC-ONSET 


Although fixed, this ranking can be interrupted in a given analysis by conflicting 
constraints. The ranking’s constraint of current interest is the final one. Assuming 
that it is available from UG, tableau (52) shows how the fricative devoicing effect 
can be derived, when &-FRICDEV is the conjunction of LYMAN’SLAW and 
*ONSETFRIC: 


(52) Fricative Devoicing as local conjunction 


/hand-zaam/    | AGREE | &-FRICDEV | IDONSLAR | LYMAN’SLAW | *LAR | *ONSETFRIC
  hand - saam  | *!    |           | *        |            | *    | *
  hant - zaam  | *!    |           |          |            | *    | *
  hand - zaam  |       | *!        |          | *          | **   | *
☞ hant - saam  |       |           | *        |            |      | *


Compared with (42), tableau (52) implies a slightly different but still effective
evaluation of the candidates. In the full Dutch hierarchy, &-FRICDEV is immediately
followed by the usual constraints enforcing regressive assimilation and final
devoicing, one of which is *LAR, which contributes to LYMAN’SLAW. *ONSETFRIC
will be (very) low-ranked, because Dutch obviously has fricatives in onsets.
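
The ranking of (52) can be checked with a short Python sketch. It is offered as an illustration under stated assumptions: a coda-onset cluster C1.C2 is reduced to the voicing of its two obstruents, all four voicing combinations are generated as candidates, LYMAN’SLAW is evaluated within the cluster, and IDLAR is left out because it does not appear in (52). The functions profile and optimal are hypothetical names.

```python
from itertools import product
from typing import Tuple

RANKING = ("AGREE", "&-FRICDEV", "IDONSLAR", "LYMAN'SLAW", "*LAR", "*ONSETFRIC")


def profile(c1_voiced: bool, c2_voiced: bool,
            input_c2_voiced: bool, c2_is_fricative: bool) -> Tuple[int, ...]:
    """Violations of a coda-onset cluster C1.C2, listed in the order of RANKING."""
    agree = int(c1_voiced != c2_voiced)
    lyman = int(c1_voiced and c2_voiced)           # two voiced obstruents in the cluster
    onset_fric = int(c2_is_fricative)              # C2 is the onset consonant
    fricdev = int(lyman == 1 and onset_fric == 1)  # conjunction of the two
    id_ons_lar = int(c2_voiced != input_c2_voiced)
    star_lar = int(c1_voiced) + int(c2_voiced)
    return (agree, fricdev, id_ons_lar, lyman, star_lar, onset_fric)


def optimal(input_c2_voiced: bool, c2_is_fricative: bool) -> Tuple[bool, bool]:
    """Return (C1 voiced, C2 voiced) of the winner under strict domination."""
    candidates = product((True, False), repeat=2)
    return min(candidates,
               key=lambda c: profile(c[0], c[1], input_c2_voiced, c2_is_fricative))


print("/hand-zaam/:", optimal(input_c2_voiced=True, c2_is_fricative=True))
print("/huis-baas/:", optimal(input_c2_voiced=True, c2_is_fricative=False))
```

With a voiced fricative in the onset the sketch selects the fully voiceless cluster, as in han[t-s]aam; with a voiced plosive in the onset it selects the fully voiced cluster, as in hui[z-b]aas, because the conjoined constraint is then vacuously satisfied and IDONSLAR decides.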

The proposed setup conforms to Lombardi’s idea that generally apparent pro- 
gressive assimilation ought to be due to the interference of high-ranked independ- 
ent constraints. A number of constraints enforcing similar effects are much less 
plausibly invoked: conjoining simple *LAR rather than LYMAN’SLAW makes the 
incorrect prediction that Dutch has no initial voiced fricatives; *LARFRIC (Al- 
derete 2003) cannot replace *ONSETFRIC because Dutch allows internal clusters 
such as [zb], cf. huisbaas (5a) and asbest (11a). On the other hand, the hierarchy 
in (51) adequately captures the skewed distribution of these latter examples’ mirror
image clusters: recall fatsoen ~ *fa[dz]oen from (11a), and fiets ~ fiets-en ~
*fie[dz]-en from (11b), also discussed in the previous section. Further empirical 
support will be presented immediately below. Notice also, though, that if 
*ONSETFRIC is accepted as a vital component of the proposals put forward here, 
this implies the addition of a constraint expressing ‘positional markedness’ into an 
analysis that relies on ‘positional faithfulness’ for the core voicing phenomena in 
natural languages (recall section 5). The introduction to this paper referred to 
Lombardi (2001) and Alderete (2003) (analysing Navajo) to the effect that such a
situation cannot and should not be excluded. Regarding multiple local conjunction
resulting in &-FRICDEV, to demonstrate the beneficial effects of this mechanism is 
one of the aims of Ito & Mester’s (2003) analysis of a variety of processes in 
German, and the analysis proposed here gratefully adopts this device. One of its 
attractive characteristics is that, pending further investigation, it appears to ac- 
count well for the infrequency of the fricative devoicing pattern among the lan- 
guages of the world: it is the cumulative effect of conjoining C1 and C2, where 
C1 is the result of self-conjunction. Embedded self-conjunction is probably not a 
frequent situation, and this may go some way towards curtailing potentially ad- 
verse factorial-typological implications. 

A small handful of remarks wind up this discussion, some of them empirical, 
and some theoretical regarding ‘metaconstraints’ on conjunction as proposed in 
the recent OT literature. Because of the differences in how they assess candidates, 
&-FRICDEV is empirically more accurate than ‘old’ FRICDEV was, in two ways. 
The data in (53) concern Dutch assimilating clusters larger than two obstruents: 


(53) a.²⁷ rups-band ‘caterpillar track’   ru[bz-b]and   ✓ by &-FD, * by OLD-FD
          rups-en (pl.)                  *ru[ps-p]and   * by &-FD, ✓ by OLD-FD


- sim.: aarts-dief ‘errant thief’, boks-beugel ‘knuckle dusters’, ex-dokter 
‘former doctor’, fiets-band ‘bicycle tire’, flits-blokje ‘flash cube’, gips- 
beeld ‘plaster figure’, kaats-bal ‘rubber fives ball’, loods-dienst ‘pilo- 
tage’, rots-blok ‘boulder’, sex-bom ‘sex-bomb’, etc. 


b. hoofd-zaak ‘essentials’   *hoo[vd-z]aak   * by &-FD, * by OLD-FD
   hoo[v]d-en (pl.)           hoo[ft-s]aak   ✓ by &-FD, ✓ by OLD-FD


- sim.: jeugd-vriend ‘friend from way back’, deugd-zaam ‘virtuous’, 
smaragd-groen ‘emerald green’, etc. 


The cases of interest are those in (53a) in which regressive assimilation and frica- 
tive devoicing partly overlap, the former prevailing. Old FRICDEV makes an incor- 
rect prediction here. &-FRICDEV passes the choice on, simply because there is no 
violating onset fricative; then, IDONSETLAR will make the correct choice (cf. 
(52)), routinely selecting regressive assimilation. 

Next, the data in (54) contain obstruent clusters in onsets, not present in any of 
the examples discussed so far, unless by accident. 


(54) a. stal ‘stables’ studie ‘studies’ straat ‘street’ 
stoom ‘steam’ station ‘station’ streek ‘region’ 
spin ‘spider’ spektakel ‘spectacle’ spraak ‘speech’ 
spoor ‘trace’ specerij ‘spice’ splijten ‘to split’ 
ski ‘ski’ score ‘score’ sclerose ‘sclerosis’ 
schaar ‘scissors’ schommel ‘swing’ schrijven ‘to write’
sfeer ‘atmosphere’ sfynx ‘sphynx’ sfincter ‘sphincter’

b. tsaar ‘czar’ tseetsee ‘tsetse fly’ tsunami ‘tsunami’
psalm ‘psalm’ psyche ‘psyche’ psoriasis ‘psoriasis’
xenon ‘xenon’ xylofoon ‘xylophone’ Xantippe ‘Xantippe’

c. ftisis ‘phthisis’ ftaal ‘naphthalene’ Pfeiffer ‘glandular fever’

d. ptyalase ‘ptyalin’ pterosaurus ‘pterosaur’ Ptolemaeus ‘Ptolemy’


These examples are just a handful of members of the large class of obstruent clus- 
ter-initial words in which the peripheral obstruent is (virtually always) s-. Gener- 
ally speaking, a Dutch complex onset has a ‘Germanic’ shape: roughly Obstru- 
ent(-Liquid), optionally preceded by an s- (Trommelen 1983, Fikkert 1998). Once 
the peripheral element is recognized to be a fricative, a prediction is derived 
within the current framework: since the onsets violate *ONSETFRIC, they will 
automatically be voiceless, by AGREE and &-FRICDEV. This is exactly right: onset 
obstruent clusters are voiceless in Dutch.²⁸ In other (previous) frameworks this is
often a separate ‘morpheme structure’ condition (Zonneveld 1983, 1994); here it
simply follows from the analysis. Also observe that FRICDEV does not predict this
pattern, since in most cases the fricative is not in right-edge cluster position. Each 
step down in (54) implies lower frequency/familiarity for the cluster type involved.
The data in (54b) have non-peripheral s, those in (54c) have f. The analysis
correctly predicts voicelessness. In (54d) a different area is entered, namely the
prediction that full-plosive clusters will generally be ‘faithful’ (giving voice a
chance to surface) vis-à-vis actual native speaker behaviour. Differently from
English, where cluster reduction seems a strongly preferred strategy, Dutch
speakers apply consonant-faith and often insert a schwa-like element into the cluster:
[pəterosaurus]. This process has wider application, generally affecting any
‘difficult’ cluster in loans: G[ə]dansk, G[ə]staad, T[ə]blisi, N[ə]guyen, [ə]Ng[ce],
[ə]Mbeki, [ə]Ndour, and so on.²⁹

Following Lombardi, the current analysis avoids sanctioning (forms of) final 
devoicing formulated as ‘positional markedness’. While PM is proposed to coexist
with PF, it should be noted that PM FINDEV (or *CODAVOICE) is suspect not
just because of the G & K redundancy noted in section 6. Lombardi (2001) observes
that languages shunning final voiced obstruents have a priori available a
number of different strategies: devoicing the obstruents, but also deleting them, or
resyllabifying them by inserting a final vowel. Interestingly, languages never 
seem to employ the latter two strategies. If this is true, this is — she submits — an 
inexplicable factorial-typological gap under the assumption that FINDEV is a UG 
constraint, because the absent strategies could be triggered by a high ranking of 
FINDEV » FAITH. The empirical gap is explained, however, by her core constraint 
set (section 4 of this paper) when IDLAR is replaced with MAXLAR, in itself a 
theoretically interesting move (briefly: given /pid/, universally a candidate [pidi]
is always a fatal violation of *LAR and ‘no insertion’, and [pi] always one of
MAXLAR and ‘no deletion’).

“Lombardi’s conundrum” (McCarthy 2002b:286) just outlined underlies the 
PF approach towards Dutch voicing adopted in this paper. At the same time, how- 
ever, Ito & Mester (1998, 2002, 2003) use FINDEV in a way that suggests that 
stopping it from appearing in grammars may be easier said than done: they propose
that FINDEV be the conjoined offspring of the two independently motivated
UG constraints *CODA and *LAR. In response, Fukazawa & Lombardi (2003) try 
to develop proper criteria (metaconstraints) for local conjunction. One of their 
suggestions is that the conjunction of a structural constraint (*CODA) and a mark- 
edness constraint (*LAR) be prohibited, banning FINDEV. In fact, the strong ver- 
sion of their proposal is “that only constraints from the same constraint family can 
be adjoined” (p. 204). Obviously, in the current context self-conjoined LYMAN’SLAW
falls within the range of this proposal, as does the conjunction of LYMAN’SLAW
and *ONSETFRIC: these are all members of the family of markedness constraints.
The conjunction analysis of the Dutch past tense falls outside it, involving
as it does a markedness constraint and an (OO-) faithfulness constraint. However, 
Fukazawa & Lombardi (2003:206) cite a number of recent Markedness & Faith- 
fulness conjunction cases as potential counterexamples to their own claim (Lubo- 
wicz 2002, Ito & Mester 2003). It does not seem very farfetched, therefore, to as- 
sume that for a reason currently less than perfectly understood Markedness & 
Faithfulness conjunction is allowed, but Markedness & Structure prohibited. If so, 
this paper can be taken to show that both the Dutch past tense progressive assimi- 
lation and the fricative devoicing phenomenon are cases of (or follow from) the 
conjunctions of independently motivated UG constraints.


Notes

1. This notion of historical sketch should not be confused with that of the vademecum. The pur- 
poses of this contribution are as just stated. The author is aware that Dutch voicing phenomena 
have been, and could be further, analysed in a variety of frameworks left undiscussed here. Since 
the study of [voice] is in no small measure a typological one, properties of voicing in other lan- 
guages than Dutch will bear on the proposed account. To show that, and the extent to which, this is 
so, is a matter of ongoing research. Published references to other languages that immediately bear 
on the discussion have been included in this paper (most often in the footnotes), but also here full 
coverage was not the aim and cannot be guaranteed. 


2. Phonetic symbols are used here when Dutch spelling needs clarification; see Booij (1995) and 
Heemskerk & Zonneveld (2000) for rules of Dutch spelling-pronunciation correspondence. (Note
that geminate consonants are a spelling convention, indicating that the preceding vowel is lax; the
consonants are always pronounced as singletons.) Among the letters for velar obstruents ch = [x]
(lach below), g = [ɣ] initially (grijp below) and medially (zorgen) but [x] finally (zorg); g stands
for [g] in recent loans from a variety of languages, such as angstgegner, baguette, gangster, goulash,
hooligan, jungle, reggae, gig. See also fn. 3.


3. Initial g (= [g]) in loans (see fn. 2) is not usually included in surveys like this, because com- 
pounds containing such loans as members sound slightly artificial; yet, for this author examples 
like top/nep-gangster [b-g] ‘top/mock gangster’, and proef-gig [v-g] ‘try-out gig’, seem to gladly 
follow the prediction suggested by their place in the system. 


4. The properties of the Dutch voicing rule set were given as an example of the ‘Duke of York 
gambit’ in Pullum (1976), i.e. the rules sometimes derive an output identical to the input through a 
different intermediate stage: X>Y>X, see e.g. /bloed-bank/ in (7). Whether such derivations must 
be avoided or not has since been a point of debate. Pullum (p. 100) came to the conclusion that “a 
thorough investigation [...] does not reveal any basis for a general constraint that would prohibit” 
the gambit. More recently, claiming the undesirability of a gambit analysis, Halle & Idsardi 
(1997:344-345) analysed the English ‘intrusive r’ phenomenon using a version of the Elsewhere 
Condition as a blocking device on rule application. This issue will briefly return below. 


5. Some of the literature (Booij 1995, Grijzenhout and Kramer) discusses the voice behaviour of 
‘clitics’ (articles, some adverbs, special forms of personal pronouns). The position taken here is 
that this is an underresearched area, about which most claims are premature. As just one example, 
consider Booij’s case of [von$t-ik] ‘found-I’ (next to vond-en ‘found, PL’), which he analyses as a 
case of opaque rule interaction, in which FINDEV unexpectedly precedes syllabification. For many 
speakers this form is accompanied by voiced mirror images such as [la$d-ik] ‘leave-I’ and [mu$d- 
ik] ‘must-I’, from lat-en ‘to leave’ and moet-en ‘must’ — suggesting that the issue is more complex. 
See Zonneveld (1982) for some related unexpected patterns involving ‘clitic voice’. 


6. It would not have been unreasonable to include the diminutive suffix —(d)je in this list, but it is 
omitted because of its complex allomorphy, cf. Trommelen (1983). 


7. -de is the singular inflection form; the plural takes an additional -n: (ge-)noem-de-n, etc. 


8. Ernestus & Baayen (2003) discuss an experiment using nonsense verb stems which in the present
singular end in a voiceless obstruent, in which native speakers produce assimilated past tenses
which “reflect correlations between final rhymes of [verb stems] and the underlying [voice] speci- 
fication of the final obstruents” as existing in a large database. Thus, for instance, nonsense kijns 
gives 30% kijns-te (70% kijns-de), whereas taars gives 76% taars-te (24% taars-de). Intriguing as 
they may be, these results are not followed up here.


9. An implicit assumption of the analysis is that renewed syllabification follows a relevant rule 
such as theme vowel deletion. 


10. Below, for brevity’s sake the Laryngeal node will only be included when relevant. 


11. This is possible even in P & P, assuming for instance a distinction such as that between ‘core’ 
and ‘periphery’ as in Piggott (1988), following Chomsky (1982) (but then see Chomsky cited in 
Strozer 1994:159). 


12. Wetzels and Mascaró’s (2001) paper contains an elaborate and critical discussion of
Lombardi’s work. Among other things, they argue for a binary, i.e. non-privative, feature [±voice],
on the basis of languages in which [—voice] appears to be active, such as Dutch (but see immedi- 
ately below), Yorkshire English, Parisian French, and Bakairi and Ya:the, both indigenous lan- 
guages of Brazil. Literature making the same point includes Rubach (1996) on Polish, Inkelas, 
Orgun & Zoll (1997) on Turkish, and Kramer (2000) on Ile de Groix Breton. 

Lombardi (1996) reviews some of the then available cases, and concludes that the privative 
[voice] hypothesis holds at the lexical level because (p. 32) “[a]ll of the rules that require negative 
values of these features pass the test for being postlexical rules”. From this point of view, the in- 
terest of the Dutch patterns resides in the interaction between lexical and postlexical patterns, 
mixed with the additional ingredient of real or apparent progressive assimilation.

Iverson & Salmons (2003) argue that the examples put forward by Wetzels and Mascaró lend
themselves to privative reanalyses using a more sophisticated set of laryngeal features, following 
Halle & Stevens (1971), and Avery & Idsardi’s (2001) ‘dimension theory’ of the laryngeal articu- 
lator. Their account of Dutch is a straightforward translation of Lombardi’s P & P analysis into 
their framework; their proposal on the past tense will be briefly discussed below. 


13. Or possibly a morphologically conditioned parameter-setting, a situation not unheard of in 
phonology, compare in metrical phonology English extrametricality being conditioned by the 
noun-verb distinction, see Hayes (1982). 


14. See e.g. Mohanan (1991) vs. Inkelas, Orgun & Zoll (1997). 


15. The authors make a point of claiming (p.13) that Fricative Devoicing is not “progressive as- 
similation ... at all” but “post-obstruent fricative neutralization”; they overlook the fact that the 
possibility of this view has been around in accounts of Dutch voice at least since Trommelen & 
Zonneveld (1979) and Zonneveld (1983) (cf. (3a) of this paper), and was adopted by Lombardi 
(1991). 


16. Ralph (1973) may be seen as a very early precursor of this approach. 
17. English may have extrametrical consonants for reasons of word stress, again see Hayes (1982). 


18. Lombardi (1999:287) observes that in this typology Swedish has “the only known pattern that 
requires IDLAR to be ranked above IDONSETLAR”. For comments on this point and further details 
of Swedish, see Helgason & Ringen (ms.), and Petrova et al. (ms.). The latter paper also discusses 
Russian, Hungarian and German. Other papers addressing the languages and/or some of the major 
tenets of Lombardi’s work are, for instance: Brockhaus (1995) on German, Rubach (1996) on Polish,
Iverson & Salmons (1999) on English and German, Wetzels & Mascaró (2001) on Yiddish
and Polish, and Jessen & Ringen (2002) on German. In-depth analyses of laryngeal phenomena in 
Athapaskan (Navajo) can be found in Rice (1994) and Alderete (2003). 


19. The evaluation adheres to Lombardi’s proposal (1999:295-297) that IDLAR assess configura- 
tions rather than the properties of single segments; it therefore assigns just a single mark to a viola- 
tion, as in the case of *dok-s in tableau (30b). 


20. Harms is also claimed to be active in Polish onsets (where the mirror image situation to that of 
English obtains), cf. Lombardi (1995:59-64). 

Replying to a referee’s comment, Lombardi (1995:62) acknowledges that this constraint men- 
tions “voiceless” and — given current understanding — would be “difficult to state without 
[-voice]” (also see Mohanan 1991:314). She adds, however, that such a formulation would be “to- 
tally unexplanatory”: it is not a problem introduced by the hypothesis of privative [voice], but 
seems “to fall into a class of deeper problems for phonological theory”. 


21. The other examples are: Yiddish with only one suffix, Polish [r], Athapaskan in cases of a 
“prefix-stem boundary only”, and “progressive devoicing of voiced obstruent-initial suffixes” in 
Turkish. 


22. Technically, ID-®-ONSETSTOPLAR cannot be the conjunction of the two proposed supplying 
constraints because desired output han[t-s]aam for underlying han/d-z/aam is precisely the candi- 
date violating both constraints, the conjunction of which then rejects rather than selects it. Pre- 
sumably, the proposed conjunction should be made active just in the domain of the onset. 


23. A striking difference between G & K (1998b) and (2000) is the absence from the latter paper 
of the first ‘coranked’ block. This is empirically feasible because in the authors’ view the con- 
straints cancel one another out, but it is infelicitous because it is the task of this block to explain 
‘how final devoicing is blocked in the past tense’; this issue now remains unaddressed in the only 
regularly published version of G & K’s analyses. 


24. This is not to deny that these domains can be relevant to other phonological areas of Dutch, 
such as the syllabification behaviour of the suffix -achtig discussed in section 5.


25. McCarthy (1999:385) proposes OO-IDENT, using the devoiced sg. past tense [vont] as the
Base, as a solution to the [von$t-ik] clitic case of fn. 5; the comments in the second half of the lat- 
ter footnote remain applicable, however. 


26. Similarly, Alderete (1997) applies ‘self-conjunction’ to the phenomenon of dissimilation; also 
see further cases in Pater (2001) and discussion in McCarthy (2002a:18-19, 43). 


27. Voicing harmony seems harder to determine in these long clusters, but one indication is that 
assimilation and degemination occur when the final two consonants differ just in voice, ka[z-d]eur 
being ambiguous between kas-deur ‘greenhouse door’ and kast-deur ‘cupboard door’, rij[z-d]iner 
between reis-diner ‘travel dinner’ and rijst-diner ‘rice dinner’, and so[v-d]rugs between sof-drugs 
‘failed drugs’ and soft-drugs. 


28. Recall that we are working with ‘resyllabified’ structure, at which stricter ‘earlier-level’ sylla- 
ble structure is no longer applicable, so s- is not ‘extrasyllabic’ any longer; to this extent the analy- 
sis relies on an as yet to be developed OT theory of Dutch syllable structure. 


29. Cluster reduction is sometimes found, however, and also weakening of one of the members. 
Cf. Dvorak [(d)v6rak], Danzig (as an alternative to Gdansk), [ʒ]on for English John, [dj]azz for 
jazz, a spelling pronunciation such as [j]ack for jack ‘jacket’ (similarly [j]errycan, [j]umper), 
S[w]erdlowsk, and so on. The treatment of such words in Dutch is certainly an area worth further 
investigation. For comments on similar clusters in English see Iverson & Salmons (1999), Davidson 
(2003), and Davidson et al. (2004); for German, see Wetzels & Mascaró (2001:215). 

Wetzels & Mascaró (2001:211) also claim that “word-initial clusters are always non-derived 
in Dutch”. Taken literally this is incorrect, although the existing examples are certainly less than 
fully impressive. Pairs like dom ‘ignorant’/stom ‘stupid’, duwen ‘to push’/stuwen ‘to force’, numeral 
-de/-ste (section 5), and a small handful of others (Zonneveld 1983:309), may illustrate a rare 
and certainly unproductive s-prefix, confirming the current analysis. A discontinuous temporal 
affix (deriving adverbs from nouns) appears in a limited number of cases such as s-maandag-s ‘on 
Monday’, and s-ochtend-s ‘in the morning’; initial s- is left unexpressed before obstruents, but 
curiously leaves ‘fricative devoicing’ as a trace: dinsdag-s ‘on Tuesday’ vs. vrijdag/[f]rijdag-s ‘on 
Friday’. (In Optimality Theory this invites involving Sympathy, cf. McCarthy’s 1999:331-337 
similar case in Tiberian Hebrew.) A discontinuous numeral affix appears in t-ach(t)-tig ‘80’; initial 
t- is left unexpressed before consonants, but leaves ‘fricative devoicing’ as a trace: negen-tig 
‘ninety’ vs. zeven/[s]even-tig ‘seventy’ (Van Loey/Schönfeld 1959:153). The relevance of these 
cases will depend on one’s desire to include or not such unproductive examples in an analysis of 
Modern Dutch. (Although especially the second affix may have been productive at a relatively 
recent stage, in combination with active fricative devoicing.) 


We consider two theories of laryngeal representation, one using a single feature 
[voice] generalizing across prevoicing languages and aspiration languages, and the 
other using multiple features: [voice] for pre-voicing languages and [spread glottis] 
for aspiration languages. We derive predictions for children’s early productions, and 
test these for three Germanic languages. Children acquiring Dutch, a prevoicing lan- 
guage, show de-voicing of stops, while available data from German, an aspiration 
language, show de-aspiration. Although the difference might simply reflect intrinsic 
properties of children’s early production and perception systems, we argue that a 
representational account is in order, based on multiple features. The case is made 
for English, an aspiration language, based on the early productions of a single child. 
A laryngeal harmony pattern is found which spreads voicelessness from coda to on- 
set, which is argued to involve activity of [spread glottis]. This is interpreted as evi- 
dence for a laryngeal representation involving multiple features. 


1. Introduction 

A great deal of recent research has addressed the representation of laryngeal 
features (Avery 1996, Avery & Idsardi 2001, Iverson & Salmons 1995, 2003, Jes- 
sen 1989, 1996, Jessen & Ringen 2002, Lombardi 1991, 1995, 1999, 2001, Sal- 
mons & Iverson 2003, Steriade 1995, 1997, Van Rooy & Wissing 2001, Vaux 
1998, Wetzels & Mascaró 2001). Debates in the literature focus on several major 
aspects of phonological theory, including monovalent versus bivalent features, 
gradient versus categorical processes, and phonetic detail in phonological repre- 
sentation. In discussions of these issues, evidence from a range of sources has 
been used, including phonetics, phonology, typology, and diachrony. Surprisingly, 
very little evidence from acquisition has been brought to bear on the issue of la- 
ryngeal representation. 

Acquiring the laryngeal phonology of a language amounts to identifying the 
relevant contrasts, building up a representation of laryngeal features, and learning 
to produce these contrasts in an adult-like fashion. By studying children’s devel- 
oping language systems, we can gain insight into how laryngeal features are rep- 
resented. Acquisition patterns provide a way to test claims about the representa- 
tions of laryngeal features. This paper presents corpus analyses of the acquisition 
of initial obstruents in Dutch and German, and the acquisition of initial and final 
obstruents in English. Children’s productions of voiced and voiceless obstruents 
were analysed for realizations of laryngeal features. These analyses revealed a 
number of interesting error patterns. 

First, productions of children acquiring Dutch, which is a so-called prevoicing 
language, differ from productions of children acquiring German and English, 
which are aspiration languages. Production errors of children learning Dutch tend 
to favour voiceless stops in initial position, whereas children learning German 
(Grijzenhout & Joppen-Hellwig 2002) and English (Menn 1971, Smith 1973) ex- 
hibit the opposite error pattern, producing more voiced stops. This confirms ear- 
lier results from acquisition studies of prevoicing languages such as Spanish and 
Hindi (Macken & Barton 1980b, Davis 1995) and of aspiration languages such as 
English and German (Macken & Barton 1980a). 

Second, the frequency of voicing in the targets that children attempt to 
produce does not reflect the error patterns related to voicing. For example, while 
production errors of young children acquiring Dutch show a trend toward initial 
voiceless stops (/b/ → [p] and /d/ → [t]), a statistical analysis of children’s targets 
reveals rather the reverse trend: children attempt more voiced than voiceless 
word-initial stops. This finding implies that children’s error patterns reflect factors 
other than frequency of targets in children’s productions, such as articulatory fac- 
tors or featural representation. 

Lastly, children’s voicing errors in English turn out to be conditioned by the 
laryngeal specification of segments occurring later in the word; more specifically, 
the devoicing of initial stops is triggered by a following voiceless obstruent. How- 
ever, no ‘harmonic’ effect is found for the voicing of initial obstruents when fol- 
lowed by voiced obstruents. 

These acquisition data were used to test different theories’ claims about how laryngeal 
features should be represented in languages that display a two-way laryngeal 
contrast: either with a single binary feature [±voice] (Wetzels & Mascaró 
2001), or with multiple language-dependent features, specifically monovalent [voice] 
and [spread glottis] (Iverson & Salmons 1995).¹ Results from this study will be 
argued to support Iverson & Salmons’ theory, in which aspiration languages (in- 
cluding English and German) use the feature [spread glottis], while prevoicing 
languages (including Dutch) use [voice]. 

Importantly, our study supports a multiple feature view, under which lan- 
guages use one constant active feature to represent their laryngeal contrasts in all 
positions, initial and final. The harmony pattern observed in English acquisition 
data supports this view: interaction between initial and final obstruents implies a 
shared monovalent featural representation for voicing in these positions, which 
abstracts from specific phonetic realization. A purely phonetic account cannot 
readily account for this pattern. 

This paper is organized as follows. Section 2 will discuss the major theories of 
laryngeal representation to be considered, stating predictions they make for error 
patterns in children’s productions in English, Dutch and German. In section 3, we 
will discuss Dutch acquisition data, which we will compare with German data in 
section 4. We will find that children acquiring Dutch, a prevoicing language, 
produce errors involving devoicing of stops that are voiced in the target word, 
whereas the available German data show rather the reverse pattern, in which con- 
sonants that are voiceless in target words are produced as ‘voiced’ or more accu- 
rately, as plain unaspirated stops. In comparing the acquisition data from Dutch, a 
prevoicing language, and German, an aspiration language, two major interpreta- 
tions will be considered: a phonetic one, based on intrinsic properties of children’s 
early production and perception systems, and a phonological one, based on the 
Multiple Feature Hypothesis. In section 5, these hypotheses will be tested on Eng- 
lish, using a corpus-based study of the early productions of a single child, whose 
laryngeal error patterns will be discussed in detail. We will argue that error pat- 
terns involve the activity of [spread glottis] in a laryngeal harmony pattern affect- 
ing only voiceless consonants in coda and onset of a word. This will be inter- 
preted as evidence for a representation of laryngeal contrasts involving multiple 
features, [voice] (for prevoicing languages) and [spread glottis] (for aspiration 
languages). Finally, we will discuss consequences of our findings in section 6. 


2. Theories of laryngeal representation 


2.1 One versus multiple features 

The phonetic realization of laryngeal contrasts² varies across languages. The 
main acoustic cue associated with voicing is voice onset time or VOT, which re- 
fers to the time between a segment’s release and the beginning of vocal cord vi- 
bration. There are a number of ways in which laryngeal contrasts are realized in 
languages (Cho & Ladefoged 1999). In the languages studied in this paper, a two- 
way contrast is employed. However, there are also languages that employ a six- 
way contrast, as for example Beja (Cushitic) and Igbo (Kwa) (Ladefoged 1973, 
cited in Iverson & Salmons 1995:382). For the purpose of this paper, however, it 
is important to note the VOT differences in stops³ across Dutch, German and English. 
These values (based on Lisker & Abramson 1964, Braunschweiler 1997) are 
given in Table 1. 


           Voicing Lead      Short Lag VOT     Long Lag VOT 

Dutch      -80 ms: b, d      0-25 ms: p, t 
German                       16 ms: b, d       51 ms: p, t 
English                      32 ms: b, d       59 ms: p, t 


Table 1: VOT in Dutch, German and English 


The laryngeal contrasts in these three languages can be divided into those that ex- 
hibit voicing lead (where voicing begins before the release), short lag VOT 
(where voicing begins at the time of the release or shortly afterwards), and long 
lag VOT (where there is a delay between the release and the beginning of voic- 
ing). In languages such as Dutch, the contrast in initial position is one between 
voicing lead and short lag VOT, while in aspiration languages such as German, 
the initial contrast is one between short lag and long lag VOT (where long lag 
VOT results in voiceless aspirated stops). Similar to Dutch are languages such as 
French and Spanish. Similar to German are English and most other Germanic lan- 
guages (except Dutch and Germanic languages such as Afrikaans, Frisian, and 
Yiddish). 
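
The three-way VOT classification described above can be sketched in a few lines of Python. This is an illustration only: the function name is hypothetical, and the 40 ms boundary between short lag and long lag is an assumption chosen to separate the values in Table 1, not a threshold proposed in the studies cited here.

```python
def vot_category(vot_ms: float, long_lag_threshold_ms: float = 40.0) -> str:
    """Classify a voice onset time (in ms) into one of three VOT categories.

    The default 40 ms boundary is an illustrative assumption, not a value
    taken from the text.
    """
    if vot_ms < 0:
        return "voicing lead"   # voicing begins before the release
    if vot_ms <= long_lag_threshold_ms:
        return "short lag"      # voicing begins at or shortly after the release
    return "long lag"           # voicing is delayed after the release


# Values from Table 1; the Dutch short-lag range (0-25 ms) is represented
# here by its two endpoints.
table1 = {
    "Dutch": {"b, d": [-80], "p, t": [0, 25]},
    "German": {"b, d": [16], "p, t": [51]},
    "English": {"b, d": [32], "p, t": [59]},
}

for language, stops in table1.items():
    for series, values in stops.items():
        categories = sorted({vot_category(v) for v in values})
        print(f"{language:8s} {series}: {', '.join(categories)}")
```

With these assumed boundaries, Dutch /b, d/ fall in the voicing lead category, whereas German and English /b, d/ fall in the short lag category, which is the asymmetry that the two hypotheses below represent differently.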

The question then naturally arises as to whether the featural representations of 
prevoicing and aspiration languages are different. There are two primary views in 
the literature. The standard approach within current phonological theories assumes 
that a single feature captures the laryngeal contrasts of all languages with a binary 
contrast, generalizing across prevoicing languages and aspiration languages. This 
single feature is either a binary feature [±voice] (Steriade 1995, Wetzels & Mascaró 
2001), or monovalent [voice] (Mester & Ito 1989, Cho 1990, Lombardi 1995, 
1996). For the purposes of this paper, we refer to these theoretical variants as the 
Single Feature Hypothesis. Laryngeal specifications for Dutch, German and Eng- 
lish for both variants are given in Tables 2 and 3. 


           Voicing Lead      Short Lag VOT     Long Lag VOT 

Dutch      [+voice]          [-voice] 
German                       [+voice]          [-voice] 
English                      [+voice]          [-voice] 


Table 2: Laryngeal feature representation for Dutch, German and English 
under the Single Feature Hypothesis, using a binary feature [±voice] 


           Voicing Lead      Short Lag VOT     Long Lag VOT 

Dutch      [voice]           [ ] 
German                       [voice]           [ ] 
English                      [voice]           [ ] 


Table 3: Laryngeal feature representation for Dutch, German and English 
under the Single Feature Hypothesis, using a monovalent feature [voice] 


Note that the laryngeal contrasts of Dutch, German and English are captured with 
the same distinction between [+voice] and [-voice] (or [voice] and [ ]), but the 
acoustic correlates for these features differ for Dutch versus German and English. 
A second approach to laryngeal features was advanced by Jessen (1989, 1996) 
and Iverson & Salmons (1995, 2003), who argue that laryngeal features are best 
represented with multiple monovalent features such as [voice] and [spread glot- 
tis]. In languages with a binary laryngeal contrast, only one of these (the active 
feature) is underlyingly specified. A language’s selection of laryngeal feature can 
be diagnosed by its active phonological processes, and it tends to correlate with 
VOT properties of stops. This approach will be referred to as the Multiple Feature 
Hypothesis since it assumes two monovalent features, [voice] and [spread glottis].⁴ 
According to Iverson & Salmons, prevoicing languages, such as Dutch, represent 
the laryngeal contrast by a monovalent feature [voice], such that voiced 


ACQUISITION OF [VOICE] 45 


stops are specified and voiceless stops are unspecified. Aspiration languages, such 
as German and English, select the active feature [spread glottis], such that aspi- 
rated stops (voiceless) are specified, and unaspirated stops (voiced or voiceless) 
lack specification, indicated by [ ]. The laryngeal specifications under this ap- 
proach for Dutch, German and English are given in Table 4. 


           Voicing Lead      Short Lag VOT     Long Lag VOT 

Dutch      [voice]           [ ] 
German                       [ ]               [spread glottis] 
English                      [ ]               [spread glottis] 


Table 4: Laryngeal feature representation for Dutch, German and English 
under the Multiple Feature Hypothesis, [voice] and [spread glottis] 


Under the Multiple Feature Hypothesis, the Dutch voicing contrast is expressed 
by representing pre-voiced stops with the feature [voice], while voiceless seg- 
ments lack specification in their phonological representation. Aspiration lan- 
guages such as German and English represent their laryngeal contrast with [spread 
glottis] on aspirated stops, and lack of specification on plain (unaspirated voice- 
less) stops. Note that both [voice] and [spread glottis] are abstract phonological 
features in the sense that their phonetic realizations vary and depend on the posi- 
tion in the word. For example, [spread glottis] is realized with maximal aspiration 
(i.e. fully abducted vocal folds) only in the onset of foot-initial syllables, while 
other positions have weaker implementations (Iverson & Salmons 1995:377). 

Having sketched the Single Feature Hypothesis and the Multiple Feature Hy- 
pothesis, we are now in a position to turn to acquisition, which provides a testing 
ground for theories of laryngeal feature representation. 


2.2 Acquisition of laryngeal contrasts: Previous studies 

Previous studies on the acquisition of voicing have found developmental dif- 
ferences between prevoicing and aspiration languages. With respect to the time 
course of acquisition, it appears that laryngeal contrasts are acquired later in 
prevoicing languages than in aspiration languages (Macken & Barton 1980a,b, 
Davis 1995). While the Dutch contrast is acquired some time around the age of 
three (Kuipers 1993a,b, Beers 1995), the English contrast is acquired relatively 
early, by the age of two (Macken & Barton 1980a). Previous research (Davis 
1995) has indicated a role of acoustic salience in these acquisition differences, 
where prevoicing (voicing lead) is argued to be less salient than aspiration (long 
lag VOT). This suggests that some of the acquisition differences seen in lan- 
guages are to some extent attributable to ease of perception and, possibly, produc- 
tion. However, differences in acoustic salience across languages do not exclude 
the possibility that differences in acquisition are due to different feature represen- 
tations across languages. This paper will explore the phonetic versus phonological 
accounts for the patterns seen in acquisition. 


2.3 Further assumptions and predictions 

The accuracy of children’s productions can be taken to reflect children’s pho- 
nological knowledge and representations. Children’s production errors have been 
argued to reflect innate universal grammar (Jakobson 1941/1968 and others), 
given that children’s production errors can often be characterized as neutralizing 
to the unmarked value. For example, children often delete final consonants and 
produce CV syllables, e.g., taart /ta:rt/ ‘cake’ is produced as [ta:] in Dutch (Fikkert 
1994), Tag /tag/ ‘day’ as [da:] in German (Kerstin 1;5, see below), and tape 
/teyp/ as [tʰe:] in English (Seth 1;7, see below). These production errors can be 
interpreted as reflecting phonological knowledge, such as the knowledge that the 
universally preferred syllable shape is a CV syllable. It is a well-known observa- 
tion (Jakobson 1941/1968) that criteria for markedness based on cross-linguistic 
evidence are supported by language acquisition, as children tend to produce the 
least marked properties before more marked ones (cf. Zamuner 2003, Zamuner, 
Gerken & Hammond 2005). Returning to the example of syllable structure, we 
note that children initially produce the least marked CV syllables, before produc- 
ing more marked syllable shapes, such as those with final consonants (Fikkert 
1994, Levelt et al. 2000). 

Assuming Jakobson’s hypothesis that children’s initial errors reflect the un- 
marked values of phonological features, we can derive a number of predictions 
regarding children’s error patterns, based on the Single Feature Hypothesis and 
the Multiple Feature Hypothesis. These predictions are given in Table 5. 


                             Representation     Unmarked   Error Type 

Single Feature Hypothesis    [±voice]           [-voice]   [+voice] → [-voice] 
                             or 
                             [voice]            [ ]        [voice] → [ ] 

Multiple Feature Hypothesis  [voice]            [ ]        [voice] → [ ] 
                             (prevoicing languages) 
                             [spread glottis]   [ ]        [spread glottis] → [ ] 
                             (aspiration languages) 


Table 5: Predictions of the acquisition of laryngeal features based on the 
Single Feature Hypothesis and Multiple Feature Hypothesis 


Recall that the Single Feature Hypothesis makes use of a single binary feature of 
[±voice] or monovalent feature [voice]. The unmarked value for this theory is invariant 
across languages, because all languages utilize the same feature to represent 
laryngeal contrasts. With a binary feature, this unmarked value is [-voice]. If 
children’s initial productions tend toward the unmarked value, this would predict 
that the direction of errors is cross-linguistically uniform, and should be independent 
of the language that children are acquiring. Accordingly, children learning 
Dutch, German or English are all predicted to make devoicing errors [+voice] 
→ [-voice]. These errors would affect words starting with (marked) [+voice] consonants, 
while words starting with (unmarked) [-voice] consonants should not be 
affected. (For a monovalent feature, predictions are essentially the same, although 
the specifications are slightly different.) In contrast, the Multiple Feature Hy- 
pothesis would predict differences between prevoicing and aspiration languages 
regarding the types of consonants that are affected. Languages with prevoicing 
should display devoicing errors [voice] → [ ] (omission of the feature [voice]) in 
words starting with (marked) voiced consonants, while words starting with (unmarked) 
[ ] voiceless consonants should not be affected. Children acquiring 
aspiration languages, by contrast, should produce de-aspiration errors [spread glottis] → 
[ ] (omission of the feature [spread glottis]) in words starting with (marked) aspirated 
consonants, while words starting with (unmarked) unaspirated consonants 
should not be affected. In Table 6, the predictions are spelled out in terms of pho- 
netic symbols. 


                             Dutch          German          English 

Single Feature Hypothesis    /b/ → [p]      /b/ → [pʰ]      /b/ → [pʰ] 
Multiple Feature Hypothesis  /b/ → [p]      /pʰ/ → [p]      /pʰ/ → [p] 


Table 6: Predictions of laryngeal errors for Dutch, German and English, based on the 
Single Feature Hypothesis and Multiple Feature Hypothesis 


In sum, using Jakobson’s hypothesis that children’s errors are changes in the di- 
rection of the unmarked, theories of laryngeal representation make different pre- 
dictions about consonants which are prone to undergo errors in acquisition. 
Hence, the acquisition of Dutch versus German and English provides an excellent 
test case for these theories. 
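
The logic of these predictions can be summarized as a small lookup, sketched here in Python. The sketch assumes the monovalent variant of the Single Feature Hypothesis and Jakobson’s hypothesis that errors omit the marked (specified) feature; the dictionary and function names are hypothetical and serve only as illustration.

```python
# Active laryngeal feature per language under each hypothesis (cf. Tables 3 and 4).
# All names below are hypothetical, introduced for illustration only.
ACTIVE_FEATURE = {
    "single": {"Dutch": "voice", "German": "voice", "English": "voice"},
    "multiple": {
        "Dutch": "voice",
        "German": "spread glottis",
        "English": "spread glottis",
    },
}


def predicted_error(hypothesis: str, language: str) -> str:
    """Predicted error type: omission of the specified (marked) feature."""
    feature = ACTIVE_FEATURE[hypothesis][language]
    return f"[{feature}] -> [ ]"


def affected_targets(hypothesis: str, language: str) -> str:
    """Which word-initial stops are predicted to undergo errors."""
    feature = ACTIVE_FEATURE[hypothesis][language]
    if feature == "voice":
        return "targets specified for [voice] (/b, d/)"
    return "targets specified for [spread glottis] (aspirated /p, t/)"


for hypothesis in ("single", "multiple"):
    for language in ("Dutch", "German", "English"):
        print(f"{hypothesis:8s} {language:8s} "
              f"{predicted_error(hypothesis, language):24s} "
              f"{affected_targets(hypothesis, language)}")
```

The two hypotheses return the same prediction for Dutch and diverge only for German and English, which is why the aspiration languages carry the weight of the comparison.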


3. Dutch 

To test the predictions of the Single and Multiple Feature hypotheses, we col- 
lected acquisition data from Dutch, German, and English. Different corpora from 
the CHILDES database were studied. We will start with a discussion of the Dutch 
data, which were taken from the CLPF database (Fikkert 1994, Levelt 1994). The 
data from 11 Dutch monolingual children whose ages range between 1;0 and 2;11 
were studied; this involved approximately 20,000 utterances. Examples of voicing 
errors are below. 


(1) Examples of laryngeal errors in Robin’s utterances 


a. douche ‘shower’ [tus] (1;10.21) 
b. dier ‘animal’ [tix] (1;10.21) 
c. beer ‘bear’ [pi] (1;7.13) 
d. bal ‘ball’ [pal] (1;7.13) 
e. baby ‘baby’ [pipi] (1;8.10) 
f. thuis ‘home’ [dœys] (1;5.10) 


(2) Examples of laryngeal errors in Tom’s utterances 


a. boot ‘boat’ [po] (1;5.0) 
b. bal ‘ball’ [pa] (1;5.14) 
c. bed ‘bed’ [pet] (1;5.28) 
d. doen ‘do’ [tun] (2;1.14) 
e. paard ‘horse’ [bat] (1;3.24) 


Only productions of initial stops /b/, /p/, /d/ and /t/ were considered. These stops 
had to be faithfully realized for place of articulation to be included in our analy- 
ses. Dutch lacks the voicing contrast in velars; hence we did not consider the velar 
stop /k/. Also, fricatives were not included, because in many Dutch regions the 
distinction between voiced and voiceless fricatives is disappearing (Slis & van 
Heugten 1989, Ernestus 2000, Van de Velde et al. 1996, Van de Velde & van 
Hout 2001). 

Not all children were monitored for the same period of time: Figure 1 shows 
the ages of the different children in the database. Note that at the beginning and 
end of the age span, data were collected for only one or two children: Tom at 1;0, 
Leon at 2;9 and Noortje at 3;0. 


[Chart: recording period for each child, including Noortje, Tirza, Leonie, Catootje and Leon. Vertical axis: Children. Horizontal axis: Ages (years; months).] 


Figure 1: Breakdown of children’s ages from the CLPF database 


In Figure 2, the number of target words is given for each child. Note that the 
number of tokens is given here: for types, a similar pattern was found. In the re- 
mainder of the discussion of Dutch, we will only present results from token analy- 
ses. Figure 2 shows that all children attempted more voiced targets (/b/ and /d/- 
initial words, a total of 4871 targets) than voiceless targets (/p/ and /t/-initial 
words, a total of 2244 targets). On the basis of this, one might predict that children 
will be more accurate when producing voiced targets than when producing voice- 
less targets, but we will see that we find the opposite pattern: overall, children 
produced more voiceless than voiced stops. 


[Chart: number of target words per child. Vertical axis: Number of target words.] 


Figure 2: Number of target words (in tokens) per child 


Figure 3 shows the error percentages of all children when producing word-initial 
stops. These percentages were determined by averaging the percentage error rate 
across children. 


[Line chart. Vertical axis: Error % (0 to 100). Horizontal axis: Age (years; months), in three-month bins from 1;0-1;2 to 2;9-2;11.] 


Figure 3: Error percentages of all children 


In Figure 3, all errors children made in the voicing value of word initial segments 
are shown. Thus it collapses errors made both in voiced and voiceless segments. 
Children’s productions become more faithful during the time they were studied. 
Apparently, the overall development curve is U-shaped, but this appearance is 
caused by the fact that at the age periods 1;0, 2;9 and 3;0 there are only data from 
one or two children. Noortje, who is the single child providing data at 3;0, was 
found to be late in her overall phonological development (Fikkert 1994). Hence, 
she causes a rise of the curve at this point. Her error rate for the production of ini- 
tial stops remains quite high during the entire period in which she was studied. 
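
The effect of averaging per-child percentages in a bin with very few children can be illustrated with a short Python sketch. The counts below are hypothetical and are not taken from the CLPF data; the function name is likewise an assumption made for illustration.

```python
def mean_error_rate(per_child: dict) -> float:
    """Average the per-child error percentages, so each child weighs equally."""
    rates = [100.0 * errors / attempts for errors, attempts in per_child.values()]
    return sum(rates) / len(rates)


# Hypothetical (errors, attempts) counts for two age bins; not CLPF data.
bin_with_three_children = {
    "child_a": (10, 100),
    "child_b": (30, 100),
    "child_c": (20, 100),
}
bin_with_one_child = {"child_d": (60, 100)}

print(mean_error_rate(bin_with_three_children))  # 20.0
print(mean_error_rate(bin_with_one_child))       # 60.0
```

When a bin contains a single child, the plotted value is simply that child’s own error rate, so one child with a high error rate can produce an apparent rise in the curve.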

When errors are broken down by place of articulation, we see that this factor 
does not play any crucial role in the production of the voice value. In Figure 4, the 
errors are broken down for labial-initial words (/b/ and /p/) versus alveolar-initial 
words (/d/ and /t/). There is no significant difference (t-test, p = 0.11, two-tailed) 
between the error rates of these two places of articulation, hence, we find no evi- 
dence that voicing in labials is either more or less difficult than voicing in alveo- 
lars. 


[Line chart with two series: Labials (b/p) and Coronals (d/t). Vertical axis: Error %. Horizontal axis: Age (years; months).] 


Figure 4: Percentage of errors in coronals and labials 


In Figure 5, error rates are split for voicing errors (e.g., /p/ and /t/ produced as /b/ 
and /d/) and devoicing errors (e.g., /b/ and /d/ produced as /p/ and /t/). This clearly 
shows that there were more devoicing errors (M=42.75, SD=22.6) than voicing 
errors (M=9.25, SD=11.01). This difference is significant (t-test, p < 0.01, two- 
tailed), and holds for every stage. For all children, we see that devoicing errors 
persist well into the third year, while the rate of voicing errors drops to almost 
0%. 

We can also examine the extent to which Dutch children’s initial ‘voicing’ 
production patterns reflect the distribution of voicing in the input (van der Feest 
2004, 2007). For this analysis, we analyzed child-directed speech from the van de 
Weijer corpus (van de Weijer 1998). This corpus contains speech directed to a 
child between the ages of 2;6 and 2;9 (a selection of 18 days appears in the cor- 
pus). We conducted (type and token) counts of initial voiced and voiceless stops 
for different places of articulation. Results are summarized in Table 7. 

[Line chart. Vertical axis: Error %. Horizontal axis: Age (years; months), in three-month bins from 1;0-1;2 to 2;9-2;11.] 


Figure 5: Percentage of voicing and devoicing errors 


          Labials                          Alveolars 
          p               b                t                d 

types     151 (40.7%)     220 (59.3%)      104 (41.3%)      148 (58.7%) 
tokens    1492 (30.9%)    3342 (69.1%)     1481 (13.6%)     9389 (86.4%) 


Table 7: Distribution of voicing in child-directed speech from van de Weijer corpus 


There is a preference for voiced stops in both type and token counts in child-directed 
speech. This means that the error patterns seen in Dutch production data 
(voiceless stops are produced before voiced stops) cannot be accounted for by input 
frequencies.⁵ 
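
The percentages in Table 7 follow directly from the raw counts, as the following Python sketch shows. The counts are those reported in the table; the variable and function names are hypothetical.

```python
# Counts of word-initial stops in child-directed speech, from Table 7.
counts = {
    "types": {"p": 151, "b": 220, "t": 104, "d": 148},
    "tokens": {"p": 1492, "b": 3342, "t": 1481, "d": 9389},
}

# (voiceless, voiced) pairs sharing a place of articulation.
PAIRS = [("p", "b"), ("t", "d")]


def voiced_share(row: dict, voiceless: str, voiced: str) -> float:
    """Percentage of voiced stops within one place of articulation."""
    return 100.0 * row[voiced] / (row[voiceless] + row[voiced])


for level, row in counts.items():
    for voiceless, voiced in PAIRS:
        share = voiced_share(row, voiceless, voiced)
        print(f"{level:6s} /{voiced}/ vs /{voiceless}/: {share:.1f}% voiced")
```

In every cell the voiced member accounts for more than half of the word-initial stops, most clearly in the token counts for alveolars.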

To summarize these data, we can say that overall, Dutch children acquire the 
voicing system quite late, having not yet completed it by the age of 2;6. We have 
seen that for the acquisition of the Dutch voicing contrast, there is no significant 
effect of place of articulation. Also, although more target words have voiced on- 
sets, children make more errors in voiced than in voiceless initial segments, while 
their overall productions contain more voiceless than voiced segments. These 
findings support the featural specification [voice] for Dutch, assuming that un- 
marked voiceless segments are acquired before the marked voiced segments. 

However, the acquisition data from Dutch are consistent with both the Single 
Feature Hypothesis and the Multiple Feature Hypothesis. Under the latter hy- 
pothesis, voiceless segments are assumed to be unspecified for the monovalent 
feature [voice], and hence, predicted to be acquired before specified voiced seg- 
ments, while under the former hypothesis (assuming a single binary feature 
[±voice]), voiceless segments would also be predicted to be acquired first. Here, 
voiceless segments are specified as [—voice], and would be less marked than 
voiced segments, which are specified as [+voice]. Since predictions from these 
hypotheses are identical, Dutch acquisition data could, in principle, never produce 
any crucial evidence deciding between these hypotheses. It is important, though, 
that the acquisition patterns cannot be explained on the basis of input frequency. 

Still, the two approaches predict different orders of acquisition for the German 
segments. The Multiple Feature Hypothesis, which assumes that aspiration lan- 
guages such as German represent the laryngeal contrast by [spread glottis], would 
predict that voiced segments are acquired first, since these are unspecified for 
[spread glottis], and hence unmarked. The Single Feature Hypothesis, on the other 
hand, assumes voiceless segments to be universally specified as [—voice], and for 
that reason would predict such (unmarked) segments to be acquired first. We now 
turn to a discussion of German to see which of the two approaches is supported by 
our acquisition data. 


4. German 

German is an aspiration language, which differs from prevoicing languages 
such as Dutch in encoding its two-way laryngeal contrast by aspiration versus 
non-aspiration, at least in word onset position. As pointed out above, the Multiple 
Feature Hypothesis represents the German laryngeal contrast as one of [spread 
glottis] for aspirated /pʰ/ versus [ ] for plain /b/ (Jessen 1996, Jessen & Ringen 
2002). Accordingly, the prediction from this hypothesis is that the production errors 
of children acquiring German will be predominantly of the /pʰ/ → [b] (or 
‘lenition’) type, matching a neutralization of the feature [spread glottis]. Phoneti- 
cally, such errors would amount to a failure to realize aspiration on a stop that is 
lexically specified as [spread glottis]. 

German data were collected from the Nijmegen Database in CHILDES 
(MacWhinney 1999).° We considered data from the only child in the database for 
which sufficient phonetic transcription was available, Kerstin (aged 1;3—3;4). 
From this large longitudinal database (containing approximately 25,000 utter- 
ances) we selected Kerstin’s productions between ages 1;0 and 2;2, which allowed 
us to track her development with respect to laryngeal specifications. 

Some characteristic examples of Kerstin’s de-aspiration errors are given be- 
low: 


(3) Examples of ‘voicing’ errors in Kerstin’s utterances 

    a. Papa   ‘daddy’   baba (1;5.7, 1;6.20, 1;7.24, 1;10.3, 1;11.20, 2;0.5) 
    b. Puppe  ‘doll’    bibbaa (1;3.22), bubbaa (1;3.22) 
    c. Tag    ‘day’     daa (1;5.3) 
    d. Teddy  ‘Teddy’   diddie (1;5.6), dide (1;6.13), didi (1;7.24, 1;8.22) 
    e. Turm   ‘tower’   dum (2;3.1) 


Note that the informal transcription indicated in the corpus of items such as Papa 
as ‘baba’ and Tag as ‘daa’ suggests pre-voicing, rather than just de-aspiration. 
This was presumably due to a language-specific bias on the part of the transcrib- 
ers, who may have perceived unaspirated stops as lenis stops /b, d/. We will inter- 
pret the transcriptions conservatively as evidence for de-aspiration only. In com- 
parison, only a very small number of errors in the opposite direction was found. 
The single clear example is Becher (1;5.17) transcribed in the corpus as ‘peschel’, 
presumably phonetically [pʰɛʃəl]. The near complete absence of initial 
devoicing/aspiration in Kerstin’s utterances contrasts with the situation in English, as we 
will see in section 6. 

Quantitative analysis confirms that Kerstin’s errors are almost exclusively of 
the ‘voicing’ type: an unaspirated realization of stops corresponding to aspirated 
stops in the adult language. See Figure 6. 


[Line graph: error percentage (y-axis) by age in years;months (x-axis, 1;0 to 2;0), with separate lines for devoicing and voicing errors.] 



Figure 6: Onset voicing versus devoicing in Kerstin’s errors 


Errors of the ‘devoicing’ type were extremely rare, while ‘voicing’ errors were 
abundant. This matches earlier observations for German acquisition (Grijzenhout 
& Joppen-Hellwig 2002). The prediction of the Multiple Feature Hypothesis is 
thus borne out. 

To what extent does Kerstin’s initial ‘voicing’ pattern reflect the statistics of 
the input? We carried out an analysis of child-directed speech in the same CHIL- 
DES corpus, based on utterances from caretakers present during recording ses- 
sions in the relevant periods (Kerstin’s age 1;0—1;12). We conducted (type and 
token) counts of initial voiced and voiceless stops for different places of articula- 
tion. Results are summarized in Table 8. 


          Labials               Alveolars             Velars 
          p         b           t         d           k         g 
types     32        80          52        88          94        112 
          (28.57%)  (71.43%)    (37.14%)  (62.86%)    (45.63%)  (54.37%) 
tokens    121       458         157       2361        638       676 
          (20.9%)   (79.1%)     (6.24%)   (93.76%)    (48.55%)  (51.45%) 


Table 8: Distribution of voicing in child-directed speech from Kerstin's corpus 


There is a noticeable trend toward voiced initial stops (especially for labials) in 
child-directed speech during Kerstin’s second year.’ Kerstin’s error pattern is thus 
compatible with the input she received. But although Kerstin’s input matches the 
direction of her errors, input statistics alone cannot account for the error pattern. 
This is because Kerstin produced virtually no errors of the devoicing type during 
her second year, a much stronger result than might be expected on the basis of the 
input pattern alone. 

In sum, the error pattern of a German child between ages 1;0 and 2;3, showing 
abundant voicing errors in initial stops, but hardly any devoicing errors, is natu- 
rally accounted for by the Multiple Feature Hypothesis as omission of the active 
feature [spread glottis], but not by the Single Feature Hypothesis, while an input- 
based account offers only a partial explanation. 


5. Interpretation and further predictions 

Summarizing so far, we found that the acquisition of the initial voicing con- 
trast in Dutch is rather slow, and completed beyond the age of 2;6. Errors are pre- 
dominantly of the ‘devoicing’ type. For German, the initial contrast is acquired 
earlier, and seems completed by the age of 2;0. Errors are overwhelmingly of the 
‘voicing’ type (presumably, lenition or de-aspiration). 

Two plausible interpretations of these findings suggest themselves, one phono- 
logical and the other phonetic. A strongly phonological account, which we have 
been assuming thus far, seeks a featural basis for the observed developmental dif- 
ferences. On the assumption that errors in early productions target the unmarked 
feature value, findings for Dutch and German would favour the Multiple Feature 
Hypothesis over the Single Feature Hypothesis, since only the former predicts dif- 
ferences in error patterns between the languages. The Multiple Feature Hypothesis 
models the Dutch voicing contrast on a monovalent feature [voice], and hence 
would correctly predict production errors of Dutch children to result in featurally 
unspecified stops, which are phonetically interpreted as ‘voiceless’. The German 
laryngeal contrast, as opposed to Dutch, is based on a monovalent feature [spread 
glottis], which predicts German children’s production errors to favour unspecified 
stops, phonetically realized as ‘unaspirated’ (that is, lenis and voiceless). The Sin- 
gle Feature Hypothesis, on the other hand, represents both languages by a single 
feature [voice], and hence would not predict any differences in the directionality 
of laryngeal errors between German and Dutch developmental patterns, as both 
languages would represent their contrasts by a single feature. (Note that, as 
indicated in Table 6, differences may occur between phonetic error patterns in 
voicing and aspiration languages due to the language-particular implementation of the 
specifications [voice] and [ ].) 

However, an alternative articulatory interpretation might be proposed, which 
would explain differences in error patterns between the languages, and hence 
would leave no room for testing the two representational hypotheses discussed 
above. According to what we will refer to as the ‘Articulatory Effort Hypothesis’, 
young children’s initial preference for short lag VOT (that is, unaspirated voice- 
less stops) is due to lack of articulatory skills necessary to produce stops with ei- 
ther long lag VOT (aspiration) or short lead VOT (prevoicing). This would cor- 
rectly predict that early German productions show a lack of aspiration, while early 
Dutch productions show a lack of prevoicing. To account for the developmental 
differences between German and Dutch (i.e., age of acquisition of the contrast), 
the additional assumption would be needed that prevoicing is more difficult to 
produce than aspiration (Kewley-Port & Preston 1974, van Alphen, this volume). 
Alternatively, a perceptual account may be given following Davis (1995) and oth- 
ers: on the basis of greater perceptual salience of long lag VOT as compared to 
short lead VOT, children acquire the laryngeal contrast in aspiration languages 
earlier than in prevoicing languages. 

The Articulatory Effort Hypothesis explains properties of children’s errors in 
relation to the target language, but its validity need not rule out a role of featural 
representations in the explanation of error patterns. We are thus facing the follow- 
ing question: when considering children’s laryngeal errors, how to distinguish ef- 
fects of articulatory effort from effects of feature specifications? Here is an at- 
tempt to tease the two kinds of effects apart. 

The Articulatory Effort Hypothesis would predict that errors correlate with the 
overall motoric complexity of a target. As is well-known, the articulatory effort 
required for the realization of a gesture may also depend on its position in an ut- 
terance. For example, it is much easier to maintain voicing in intervocalic contexts 
than in word-initial or final contexts. However, motoric effort should be 
independent of the presence of a target elsewhere in the utterance, specifically when 
the two targets are not adjacent (for example, when two consonants are separated by a 
vowel), or when the targets are articulatorily diverse. Cross-linguistically, the 
articulatory gestures for laryngeal contrasts and the acoustic cues are quite varied. 
Cues include VOT, closure duration, and duration of the preceding vowel (Keating 
1984). Within a language, choice of laryngeal gesture may depend on a segment’s 
position in the word, in the syllable, or on neighbouring segments. For example, 
English realizes laryngeal contrasts in onset mainly by VOT, and laryngeal con- 
trasts in coda mainly by duration of the preceding vowel, closure duration, or glot- 
talization. In sum, the Articulatory Effort Hypothesis would predict few interac- 
tions in error patterns between articulatorily heterogeneous positions, such as the 
onset and coda in English. 

In contrast, a phonological account would predict contrastive specifications to 
appear in children’s error patterns, which abstracts from fine-grained phonetic re- 
alization depending on position. For example, a phonological account would pre- 
dict cases of ‘laryngeal harmony’ between onset and coda, in which only contras- 
tive features would harmonize, not redundant ones.® (Cross-linguistic studies on 
laryngeal cooccurrence patterns include MacEachern 1997, Hansson 2001, Rose 
& Walker 2001.) It should be emphasized that by ‘harmony’ we generally refer to 
any kind of interaction between segments which produces identical contrastive 
feature specifications, without implying autosegmental spreading resulting in 
doubly-linked features. As Fikkert & Levelt (2002) argue, consonant harmony at 
early stages of development may be driven by a general requirement for stops to 
be featurally similar, regardless of whether similarity is achieved by spreading, by 
default, or by phonologically active features. 

Radical underspecification of contrastive features would make an additional 
prediction that production errors reflect activity of the specified feature only, to 
the exclusion of the unspecified value. Under the Multiple Feature Hypothesis, 
laryngeal features are monovalent, mainly to capture the observation that 
voiceless unaspirates are unmarked both in prevoicing and in aspiration languages. 
Languages differ as to which feature is specified: [voice] in prevoicing languages 
such as Dutch, and [spread glottis] in aspiration languages such as English. Un- 
derspecification thus creates predictions about harmony, since only unspecified 
segments should be the targets, assimilating non-locally to specified segments. 

Different harmony effects would be predicted to occur, depending on the two 
featural approaches under comparison. Consonant harmony of place of articula- 
tion in children’s early productions is typically anticipatory (Menn 1971, Smith 
1973, Pater & Werle 2001, 2003, Fikkert & Levelt 2002), which leads us to ex- 
pect a similar asymmetry for laryngeal harmony. For this reason, we will consider 
predictions for a hypothetical anticipatory harmony pattern, in which laryngeal 
errors in the onset anticipate the coda’s specification. 

Under a Single Feature Hypothesis with a binary feature [±voice], symmetrical 
error patterns within a language would be predicted, since both values are active, 
and potentially induce errors. This would predict a pattern with both devoicing 
and voicing in onsets, depending on whichever feature value is specified in the 
coda. 


(4) Harmonies predicted by the Single Feature Hypothesis [±voice] 

    a.  P V B         →    B V B 
            [+voice]           [+voice] 

    b.  B V P         →    P V P 
            [−voice]           [−voice] 


Note that a variant of the Single Feature Hypothesis based on a monovalent 
feature [voice] would only predict harmony of the type (4a), not (4b). Thus, if 
anticipatory laryngeal harmony were to be found in children’s English, this could only 
be /PVB/ → [BVB]. This makes a strong prediction, which allows a rather 
straightforward test of these two versions of the Single Feature Hypothesis. 

The Multiple Feature Hypothesis predicts patterns to be asymmetrical, and to 
correlate with a language’s ‘active’ feature, either [voice] or [spread glottis]. That 
is, languages whose specified feature is [spread glottis] would be predicted to 
display only one kind of error: onset devoicing triggered by a voiceless coda /BVP/ 
→ [PVP], but not onset voicing triggered by a voiced coda /PVB/ → [BVB], 
because voiced codas would be laryngeally unspecified [ ]. 

(5) Harmony predicted by the Multiple Feature Hypothesis: aspiration languages 

    B V P                 →    P V P 
        [spread glottis]           [spread glottis] 


Under the Multiple Feature Hypothesis, prevoicing languages should only display 
harmonies involving voiced segments, /PVB/ → [BVB]: 


(6) Harmony predicted by the Multiple Feature Hypothesis: prevoicing languages 

    P V B        →    B V B 
        [voice]           [voice] 


Note that the Multiple Feature Hypothesis and the monovalent version of the 
Single Feature Hypothesis make similar predictions for prevoicing languages. 
Predictions differ between the ‘monovalent’ frameworks, however, for aspiration 
languages. If the Multiple Feature Hypothesis is correct, English uses monovalent 
[spread glottis], and hence should display anticipatory harmony of the ‘devoicing’ 
type /BVP/ → [PVP], whereas if the Single Feature Hypothesis is correct, 
anticipatory harmony should be of the ‘voicing’ type, /PVB/ → [BVB]. 

In sum, to compare predictions made by the Single Feature Hypothesis and 
Multiple Feature Hypothesis, we must distinguish monovalent and binary variants 
of the former. If children’s productions were to contain systematic patterns of 
voicing harmony, but not devoicing harmony, this pattern would be compatible with 
both the Multiple Feature Hypothesis and the monovalent version of the Single 
Feature Hypothesis, but it would form evidence against its binary version. Next, if 
harmony of the voicing and devoicing type were to systematically co-occur in a 
child’s productions, this would support the binary version of the Single Feature 
Hypothesis, but form evidence against both monovalent accounts. Finally, if we 
were to find that children’s production errors consistently display devoicing har- 
mony, while lacking voicing harmony, this asymmetrical pattern would favour the 
Multiple Feature Hypothesis, but constitute evidence against the monovalent and 
binary variants of the Single Feature Hypothesis. The logical options are summa- 
rized in Table 9: 

                  Voicing harmony     Both voicing and      Devoicing harmony 
                  only                devoicing harmony     only 
SFH, binary       contra              pro                   contra 
[±voice]          (circumstantial)                          (circumstantial) 
SFH, monovalent   pro                 contra                contra 
[voice] 
MFH               pro                 contra                pro 
[voice] or [sg] 

Table 9: Evidential status of hypothetical harmony patterns for the Single Feature Hypothesis 
(binary and monovalent) and Multiple Feature Hypothesis 
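
The evidential logic of Table 9 can also be sketched as a small lookup. This is only an illustration: the identifiers (EVIDENCE, supported, and the pattern labels) are hypothetical names introduced here, and the sketch ignores the ‘circumstantial’ qualification that the table attaches to some of the cells. 

```python
# Illustrative encoding of Table 9. All names are hypothetical labels chosen
# for this sketch; the 'circumstantial' qualification is left out.
PATTERNS = ("voicing_only", "both", "devoicing_only")

EVIDENCE = {
    # hypothesis: evidential status of each observed harmony pattern
    "SFH_binary":     {"voicing_only": "contra", "both": "pro",    "devoicing_only": "contra"},
    "SFH_monovalent": {"voicing_only": "pro",    "both": "contra", "devoicing_only": "contra"},
    "MFH":            {"voicing_only": "pro",    "both": "contra", "devoicing_only": "pro"},
}


def supported(pattern):
    """Return the hypotheses for which the observed pattern counts as 'pro'."""
    if pattern not in PATTERNS:
        raise ValueError(f"unknown harmony pattern: {pattern!r}")
    return sorted(h for h, row in EVIDENCE.items() if row[pattern] == "pro")


if __name__ == "__main__":
    for pattern in PATTERNS:
        print(pattern, "->", supported(pattern))
```

Read this way, a child showing devoicing harmony without voicing harmony singles out the Multiple Feature Hypothesis, which is the outcome the English test case is designed to detect. 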


Let us now turn to a test case for the Multiple Feature Hypothesis against the Sin- 
gle Feature Hypothesis: English. 


6. English 

English acquisition data may serve as a test case, since this language precisely 
meets the conditions under which harmonic anticipations of laryngeal features 
might occur. First, English matches German (but not Dutch) in being an aspiration 
language. Hence, under the Multiple Feature Hypothesis [spread glottis] is speci- 
fied, predicting this feature to be active in children’s early phonologies. Indeed, 
English-learning children display de-aspiration errors (Menn 1971). Second, 
unlike German, English lacks syllable-final laryngeal neutralization, so that coda 
obstruents are specified contrastively. Since onsets and codas both license laryn- 
geal specification, harmony effects become potentially visible. This meets the 
logical requirement which must be fulfilled for testing for positional interactions 
involving [spread glottis].° 


6.1 Earlier studies 

Earlier studies (such as Smith 1973) provide evidence for initial voicing and 
final devoicing in children’s productions. Let us first turn to some data from 
Smith (1973). During the first half of his third year (ages 2;2—2;6), Amahl realized 
most of his initial stops as plain (voiceless unaspirated), by a general neutraliza- 
tion of initial laryngeal contrasts. In this period, initial neutralization affects 
voiced targets (for example, bell), as well as voiceless ones (for example, pen). 
(We adopt Smith’s transcription.) 


(7) Initial stops realized as voiceless unaspirated, irrespective of targets 
    (ages 2;2–2;6) 
    a. bell  [be]   (2;2) 
    b. pen   [ben]  (2;2) 

In the same period, Amahl also neutralized most word-final stops to voiceless. 
Since this has audible consequences only for voiced targets (e.g. [mob] for knob), 
the overall effect is one of final devoicing. 


(8) Final stops realized as voiceless (unaspirated), irrespective of targets 
    (ages 2;2–2;5) 
    a. knob  [mob]  (2;2) 
    b. stop  [dop]  (2;2) 


Amahl’s development during this period is shown in Figure 7. The error 
percentages were calculated as the proportion of target words of which the laryngeal 
realization, [p], [b̥], or [b], deviated from the target specification, /p/ or /b/.¹⁰ For 
target /b/, for example, any realizations deviating from it, either [p] or [b̥], were 
considered as devoicing errors. 


[Line graph: error percentage (y-axis) by age in years;months (x-axis, 2;2 to 2;8), with lines for onset devoicing, onset voicing, coda devoicing, and coda voicing.] 



Figure 7: Amahl’s development (2;2—2;8) 


Note that the final contrast is acquired slightly earlier than the initial contrast. 
The error rate of final devoicing drops sharply around age 2;5, 
while that of initial neutralization (as shown in the two topmost lines) follows 
about a month later. 

The observation that Amahl’s laryngeal contrast stabilizes slightly earlier in 
final than in initial position may come as a surprise, given that the child is 
unlikely to receive input with the laryngeal distinction directly realized on the 
final stop, whereas initial stops have robust VOT cues.¹¹ However, other cases are 
known of children acquiring English who mastered the laryngeal distinction in 
codas before it emerged in onsets (Clark & Bowerman 1986:55, fn. 5, Vihman & 
Ferguson 1987:383, Fey & Gandour 1982, but see Stoel-Gammon & Buder 1999). 
More generally, other consonant types, such as fricatives and liquids, are more 
likely to be first acquired in final position (Ferguson 1978, Stoel-Gammon 1985). 

At best, we can offer speculative accounts of the coda-onset lag in Amahl’s 
laryngeal development, as no acoustic data are available for verification. One 
account would attribute the lag to production factors rather than lexical 
representation; Amahl may have mastered control over vowel duration, the primary 
realization of laryngeal contrast in coda, before the gestural coordination between release 
and voice onset which is required for aspiration. Under this scenario, lexical rep- 
resentations in onset and coda are stable at an earlier stage, setting the stage for 
rapid across-the-board changes once the relevant gestures are mastered. Indeed, 
Amahl shows a rapid development of the laryngeal contrast in onset and coda, 
both of which are completed within approximately three months. Nevertheless, an 
explanation of the coda-onset lag based on developing lexical representations 
cannot be ruled out, because laryngeal error rates vary somewhat between indi- 
vidual lexical items, an observation which is difficult to explain under a produc- 
tion-only account. For example, during period 10-11 (at the age of 2;6) all three 
occurrences of bread had neutralized onsets, while all three occurrences of Braj (a 
name) were realized correctly. Hence, it is quite possible that Amahl’s laryngeal 
contrast was lexically represented in final position before it emerged in initial po- 
sition. 

Relative strength of the laryngeal contrast in coda position will become a ma- 
jor factor in our central case study, to which we turn next. 


6.2 Seth: a case study 

The data in this section were taken from a large CHILDES database (Wilson 
& Peters 1988), containing approximately 12,500 utterances, with a total number 
of 39,000 words. All data were from a single monolingual child, named Seth, 
aged between 1;7 and 4;1, who was acquiring American English. Utterances in the 
database are matched with target words in plain orthography, and are phonetically 
transcribed at a level allowing for qualitative and quantitative analysis of voicing 
patterns. The original sound files were kindly made available to us in digitized 
form by Ann Peters and Brian MacWhinney for further transcription and acoustic 
analysis. 

We monitored Seth’s development between ages 1;7 and 2;5, the period dur- 
ing which major changes in the laryngeal contrast took place, and at the end of 
which Seth’s productions of the contrast were largely indistinguishable from 
those of adults. 


6.2.1 Initial devoicing. We first focused on word-initial position and collected all 
of Seth’s productions of content word” targets containing an initial voiced or 
voiceless stop. This allowed us to search for factors which possibly influenced the 
proportion of two major types of error: initial ‘voicings’ (actually, de-aspirations 
resulting in plain stops) and initial ‘devoicings’ (actually, aspirations). The result- 
ing dataset contained 227 types and 4354 tokens. Table 10 shows type and token 
distributions of initial voiced and voiceless target stops, for different places of ar- 
ticulation. 

          Labials               Alveolars             Velars                Total 
          p         b           t         d           k         g 
types     34        56          45        26          48        18          227 
          (37.78%)  (62.22%)    (63.38%)  (36.62%)    (72.73%)  (27.27%) 
tokens    902       628         689       914         553       668         4354 
          (58.95%)  (41.05%)    (42.98%)  (57.02%)    (45.29%)  (54.71%) 


Table 10: Distribution of initial stop targets in Seth’s productions (types & tokens) 


A representative set of examples of initial errors in Seth’s early utterances is 
given below: 


(9) Initial devoicing in Seth’s utterances 

    Labials                           Alveolars 
    a. bark      [pa:k]     1;8       e. Dabee     [tabij]    1;7 
    b. boy       [paj]      [...]     f. doughnut  [towna:]   2;0 
    c. bike      [pajk]     1;10      g. dog       [tagij]    2;2 
    d. backpack  [pokpok]   1;11      h. drink     [trunk]    2;5 

    Velars 
    i. [...]     [kijs]     1;8 
    j. [...]     [ko]       1;8 
    k. [...]     [karrit]   2;2 
    l. [...]     [ket]      2;3 


(10) Initial voicing in Seth’s utterances 

    Labials                                Alveolars 
    a. penny          [bonij]    1;11      e. tape    [dejp]    1;9 
    b. play           [bwedja]   1;11      f. trunk   [drank]   1;10 
    c. peanut butter  [bi bada]  1;11      g. tan     [don]     2;0 
    d. play           [bejdl]    2;1       h. tell    [doa]     2;0 

    Velars 
    i. kiss it   [giset]   1;8 
    j. kitchen   [gisan]   1;9 
    k. cookie    [gukij]   1;10 
    l. cool      [guw]     1;10 


Although both types of initial errors are abundant, Seth makes more devoicing 
errors than voicing errors in initial position, as shown in Table 11. This presents 
token counts of voiced and voiceless targets (/B/ and /P/) and their realization 
(voiced [B] or voiceless [P]) over a succession of four three-month periods (1;7— 
2;5), where the last column gives the results collapsed across periods. In the 
notation we use, B refers to voiced (labial, coronal, or velar) stops, and P to voiceless 
stops. Error percentages are indicated in cells for unfaithful realizations. For ex- 
ample, for the period 1;7—1;9, the corpus contains 415 targets with initial voiced 
stops, 372 of which were realized faithfully, and 43 of which (10.4%) were de- 
voiced. 


Age          1;7–1;9       1;10–1;12     2;0–2;2       2;3–2;5       1;7–2;2 
Realiz.      [B]   [P]     [B]   [P]     [B]   [P]     [B]   [P]     [B]    [P] 
target /B/   372   43      273   17      690   31      775   9       2210   100 
target /P/   24    537     16    627     9     551     1     379     50     2094 
chi-square   χ² = 13.8     χ² = 6.7      χ² = 7.6      χ² = 2.4      χ² = 13.6 
             p < 0.001     p < 0.01      p < 0.01      (n.s.)        p < 0.001 


Table 11: Distribution of initial errors in Seth’s productions 


Over the monitored period (1;7—2;5), both error types decreased continuously. To 
determine whether devoicing errors were more frequent than voicing errors, a se- 
ries of chi-square tests was conducted. Results were significant for all periods ex- 
cept the period of 2;3-2;5. 
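
The per-period comparison can be retraced with a short sketch. It assumes that the tests were plain Pearson chi-square tests on 2×2 tables (target voicing by faithful versus unfaithful realization) without a continuity correction; the text does not state this, so it is an assumption of the sketch. The function and variable names are illustrative only, and the counts are the token counts given for the four periods in Table 11. 

```python
# Sketch: Pearson chi-square (1 df, no continuity correction) on a 2x2 table.
# The choice of an uncorrected test is an assumption, not stated in the text.

def pearson_chi2(err_a, ok_a, err_b, ok_b):
    """Chi-square statistic for the table [[err_a, ok_a], [err_b, ok_b]]."""
    n = err_a + ok_a + err_b + ok_b
    numerator = n * (err_a * ok_b - ok_a * err_b) ** 2
    denominator = ((err_a + ok_a) * (err_b + ok_b)
                   * (err_a + err_b) * (ok_a + ok_b))
    return numerator / denominator


# Token counts per period, read from Table 11:
# (devoiced /B/, faithful /B/, voiced /P/, faithful /P/)
PERIODS = {
    "1;7-1;9":   (43, 372, 24, 537),
    "1;10-1;12": (17, 273, 16, 627),
    "2;0-2;2":   (31, 690, 9, 551),
    "2;3-2;5":   (9, 775, 1, 379),
}

CRITICAL_05 = 3.841  # chi-square critical value for 1 df at alpha = 0.05

results = {period: pearson_chi2(*counts) for period, counts in PERIODS.items()}

if __name__ == "__main__":
    for period, chi2 in results.items():
        verdict = "significant" if chi2 > CRITICAL_05 else "n.s."
        print(f"{period}: chi2 = {chi2:.1f} ({verdict} at 0.05)")
```

Under that assumption, the statistic exceeds the 0.05 critical value in the first three periods and falls below it in the last, in line with the pattern described above. 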


Figure 8 shows the direction of initial errors in tokens (devoicing versus voicing) 
as it develops between the ages of 1;7 and 2;5. 


[Line graph: error percentage (y-axis) by age period (x-axis: 1;7–1;9, 1;10–1;12, 2;0–2;2, 2;3–2;5), with lines for devoicing and voicing errors.] 



Figure 8: Direction of initial errors in Seth’s productions 

Note that devoicing errors occur about twice as frequently as voicing errors 
throughout Seth’s development. 

The dominance of devoicing errors extends to types, as Table 12 shows:'* 

                 [B...]    [P...] 
target /B.../    68        32 
target /P.../    19        108 
Chi-square       χ² = 9.3, p < 0.01 


Table 12: Distribution of voicing and devoicing errors 
in Seth’s productions (types) 


The high proportion of devoicing errors in the early productions of an 
English-learning child apparently clashes with our previous observations for German, 
where ‘voicing’ (actually de-aspirating, lenition) errors prevailed. It seems to run 
against the typological prediction made in section 2, according to which aspiration 
languages would display errors of the ‘voicing’ (de-aspiration) type, so that Eng- 
lish would parallel German. Upon closer inspection, however, we see that Seth’s 
initial devoicings are not simply neutralizations to the unmarked value. 


When we differentiate Seth’s initial devoicing errors according to their con- 
texts in the word, it becomes clear that following consonants play a major condi- 
tioning role. Figure 9 shows that initial devoicing is much more frequent in targets 
in which a voiceless obstruent follows (e.g., bark, drink, geese) than in targets 
which have no following voiceless obstruent (e.g., boy, dog, go): 


[Line graph: error percentage (y-axis) by age period (x-axis: 1;7–1;9, 1;10–1;12, 2;0–2;2, 2;3–2;5), with lines for targets with a voiceless consonant following and targets with no voiceless consonant following.] 

Figure 9: Initial devoicing in Seth’s productions: the role of following consonants 


We used a chi-square test to test the difference in error rates between target cate- 
gories (words with a following voiced obstruent, a following voiceless obstruent, 
or no following obstruent), over the entire period (1;7—2;5), and found a strong 
effect (χ² = 45.4, p < 0.001). 

Next, to establish whether voiced obstruents differed from sonorants in their 
effects on initial devoicing, we broke down the category ‘no voiceless consonant 
following’. Table 13 shows devoicing in three target types: (a) /B...P/ targets, 
which have a following voiceless obstruent (e.g. back, diaper, grass), (b) /B...B/ 
targets, which have a voiced obstruent (e.g. baby, dog, give), and (c) /B...R/ tar- 
gets, which have a following sonorant or vowel (‘R’) (e.g. ball, door, gone). For 
each period, Seth’s productions are broken down into faithful [B...] and unfaithful 
[P...] realizations of the target, and error rates are indicated in ‘unfaithful’ cells. 


Age        1;7–1;9      1;10–1;12    2;0–2;2      2;3–2;5      1;7–2;2 
Realiz.    [B]   [P]    [B]   [P]    [B]   [P]    [B]   [P]    [B]   [P] 
/B...P/    198   33     132   10     197   16     156   6      683   65 
/B...B/    77    6      65    5      256   7      370   1      768   19 
/B...R/    97    4      76    2      237   8      249   2      659   16 


Table 13: Initial devoicing as a function of following consonants 


Throughout Seth’s development, initial devoicing rate is highest for /B...P/ tar- 
gets. Over the four periods, this error type reaches a much higher average (of 
8.7%) than targets /B...B/ and /B...R/ (both 2.4%). Note that in all three catego- 
ries, a gradual overall reduction of devoicing errors occurs. 


The difference between /B...P/ targets and the other targets /B...B/ and 
/B...R/ turned out to be statistically significant, as Table 14 shows. This compares 
initial devoicing rates for three targets (/B...P/, /B...B/, /B...R/) for all four 
periods. Chi-square tests were conducted for the token distribution in Table 13, 
establishing that /B...P/ targets undergo devoicing significantly more often than targets 
/B...B/ and /B...R/, while any differences in devoicing rate between /B...B/ and 
/B...R/ targets are non-significant. 


                  1;7–1;9       1;10–1;12     2;0–2;2       2;3–2;5       1;7–2;2 
/B...P/ versus    χ² = 2.8      χ² = 0.001    χ² = 6.0      χ² = 10.3     χ² = 29.2 
/B...B/           (n.s.)        (n.s.)        (p < 0.025)   (p < 0.01)    (p < 0.001) 
/B...P/ versus    χ² = 7.6      χ² = 2.0      χ² = 4.1      χ² = 4.4      χ² = 26.4 
/B...R/           (p < 0.01)    (n.s.)        (p < 0.05)    (p < 0.05)    (p < 0.001) 
/B...B/ versus    χ² = 0.9      χ² = 1.7      χ² = 0.2      χ² = 0.9      χ² = 0.003 
/B...R/           (n.s.)        (n.s.)        (n.s.)        (n.s.)        (n.s.) 


Table 14: Initial devoicing as a function of following consonants 


In sum, devoicing in /B...P/ targets is significantly more frequent than for other 
targets across periods. It is more frequent than devoicing for /B...B/ targets in two 
out of four periods, and more frequent than devoicing for /B...R/ targets in three 
out of four periods. Also, /B...B/ and /B...R/ targets cannot be distinguished in 
terms of the initial devoicing rate. 
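
The pooled comparison can be retraced in the same spirit. The sketch below takes the collapsed token counts from the final column of Table 13, computes the devoicing rate per target type, and runs the three pairwise comparisons. As before, an uncorrected Pearson chi-square on a 2×2 table is an assumption of this sketch rather than something stated in the text, and all identifiers are illustrative. 

```python
# Sketch: pooled devoicing rates and pairwise Pearson chi-square tests
# (1 df, no continuity correction; the test variant is an assumption).
from itertools import combinations

# Collapsed token counts from Table 13: target type -> (faithful [B], devoiced [P])
POOLED = {
    "/B...P/": (683, 65),
    "/B...B/": (768, 19),
    "/B...R/": (659, 16),
}


def devoicing_rate(faithful, devoiced):
    """Percentage of targets realized with a devoiced onset."""
    return 100.0 * devoiced / (faithful + devoiced)


def pearson_chi2(err_a, ok_a, err_b, ok_b):
    """Chi-square statistic for the table [[err_a, ok_a], [err_b, ok_b]]."""
    n = err_a + ok_a + err_b + ok_b
    numerator = n * (err_a * ok_b - ok_a * err_b) ** 2
    denominator = ((err_a + ok_a) * (err_b + ok_b)
                   * (err_a + err_b) * (ok_a + ok_b))
    return numerator / denominator


rates = {target: devoicing_rate(*counts) for target, counts in POOLED.items()}

pairwise = {
    (a, b): pearson_chi2(POOLED[a][1], POOLED[a][0], POOLED[b][1], POOLED[b][0])
    for a, b in combinations(POOLED, 2)
}

if __name__ == "__main__":
    for target, rate in rates.items():
        print(f"{target}: {rate:.1f}% devoiced")
    for (a, b), chi2 in pairwise.items():
        print(f"{a} versus {b}: chi2 = {chi2:.3f}")
```

The two comparisons involving /B...P/ yield large statistics, while the /B...B/ versus /B...R/ comparison is close to zero, which is the asymmetry the argument relies on. 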
We interpret these results as follows. In targets that begin with a voiced 
obstruent, a following ‘P’ (voiceless) segment triggers initial devoicing, as compared 
to a following ‘B’ (voiced) or ‘R’ (sonorant) segment, which behave as inactive 
with respect to initial devoicing. Initial devoicing in /B...P/ targets results in out- 
puts with identical laryngeal features between the word onset (which undergoes it) 
and a following [spread glottis] obstruent (which triggers it). This is arguably a 
case of laryngeal harmony of the type that was predicted in section 5 (Table 9). 
Consequently, this finding supports predictions of the Multiple Feature Hypothe- 
sis: English, a language using [spread glottis] to represent its laryngeal contrast, 
should display activity of this feature in laryngeal harmony, if such harmony were 
to be found. 

Note also that segment types predicted to be phonologically inactive by the 
Multiple Feature Hypothesis, ‘B’ (voiced obstruents) or ‘R’ (sonorants), are in- 
deed inactive. Devoicing rates for target words with following ‘B’ or ‘R’ seg- 
ments fall well below the rate observed for /B...P/ targets. Their shared behaviour 
is predicted by lack of specification for [spread glottis] under the Multiple Feature 
Hypothesis. Note that no other featural theory under consideration predicts a 
shared behaviour, since voiced obstruents will be marked [voice], while sonorants 
will not bear a distinctive laryngeal representation. 

How to account for the fact that initial devoicing marginally occurs in the 
other targets /B...B/ and /B...R/? We attribute the initial devoicing rate for these 
targets, which amounts to 2.4% on average in the period 1;7—2;5, to the instability 
of early lexical representations. That is, the early lexicon contains incomplete fea- 
tural information for lexical items, manifesting itself in variable productions, with 
both voiced and voiceless realizations. We suggest that during early production, 
lexically incomplete featural information is supplemented by three sources. First, 
context-free markedness effects (that is, omission of [spread glottis]) amount to 
context-free neutralization, which we observed as voicing in /P.../ targets. Sec- 
ond, copying of the active feature [spread glottis] amounts to laryngeal harmony 
in /B...P/ targets. Third, a certain amount of random selection occurs. Initial
devoicing in targets /B...B/ and /B...R/ occurs when random selection fills in
incomplete features in early lexical representations. The fact that initial devoicing is
facilitated by, but not categorically restricted to, /B...P/ targets, can thus be ex- 
plained by an interplay of harmony effects and random specification. 

To verify the amount of initial ‘devoicing’ (actually, aspiration) in Seth’s pro- 
ductions, we now turn to the results of phonetic analysis. 


6.2.2 Phonetic analysis. In order to determine whether devoicing results in a full
merger with target voiceless stops, we carried out narrow phonetic transcriptions
and conducted acoustic measurements of VOT. First, all stop-initial items (between
ages 1;7 and 1;9) were extracted from the digitized sound material. Next,
79 items were removed because of poor sound quality. For the remaining 605 items,
narrow phonetic transcriptions were made by five transcribers (the current authors),
and an acoustic analysis (VOT measurements) was conducted. Figure 10 shows mean
VOT values (in ms) for Seth’s voiced and voiceless stops (ages 1;7-1;8).
Seth’s values closely approximate the adult VOT values.

[Bar chart: mean VOT (ms) of voiced and voiceless stops, by place of articulation (labials, alveolars, velars)]

Figure 10: Mean VOT values in milliseconds for labials, alveolars and velars


The following criteria were applied for determining an item’s error status. A ‘de- 
voicing error’ was defined as an item whose target has a voiced onset and which 
was realized with VOT > 30 ms (for labials and alveolars), or with VOT > 50 ms 
(for velars), and which was also categorized as ‘voiceless’ by a native listener 
(one of the current authors). A ‘voicing error’ was defined as an item whose target 
has a voiceless onset and which was realized with VOT < 30 ms (for labials and 
alveolars), or with VOT < 50 ms (for velars), and which was also categorized as 
‘voiced’ by the native listener. 
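
The two criteria can be written out as a small decision procedure. The following is a minimal sketch, assuming each item is described by the voicing of its target onset, its place of articulation, its measured VOT and the listener's label; the function and variable names are illustrative and do not come from the study itself.

```python
from typing import Optional

# VOT boundaries in ms, as stated in the criteria: 30 for labials and alveolars, 50 for velars
VOT_BOUNDARY_MS = {"labial": 30.0, "alveolar": 30.0, "velar": 50.0}


def error_status(target_voiced: bool, place: str, vot_ms: float,
                 listener_label: str) -> Optional[str]:
    """Classify one stop-initial item; returns None when neither error definition applies."""
    boundary = VOT_BOUNDARY_MS[place]
    # Devoicing error: voiced target, long-lag VOT, and heard as voiceless
    if target_voiced and vot_ms > boundary and listener_label == "voiceless":
        return "devoicing error"
    # Voicing error: voiceless target, short-lag VOT, and heard as voiced
    if not target_voiced and vot_ms < boundary and listener_label == "voiced":
        return "voicing error"
    return None


# Hypothetical items (the VOT values are invented for illustration)
print(error_status(True, "velar", 62.0, "voiceless"))
print(error_status(False, "labial", 12.0, "voiced"))
print(error_status(True, "velar", 40.0, "voiceless"))
```

Note that both conditions must hold at once: an item with a long VOT that the listener nevertheless hears as voiced is not counted as an error under these definitions.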

Analysis showed that Seth’s devoicing errors resulted in initial stops (e.g. 
geese, bark) with mean VOT values which approximate mean VOT values for 
target voiceless stops (e.g. tape). See Figure 11 below. On the basis of these find- 
ings, we feel safe in assuming that devoicing errors are ‘categorical’, in the sense 
that devoiced target consonants are acoustically indistinguishable from faithfully 
realized voiceless consonants. 


6.2.3 Initial voicing. We now turn to target words whose initial consonants are 
voiceless, and look into patterns of initial voicing, in order to find out whether 
voiced obstruents behave as phonologically inactive, as predicted by the Multiple 
Feature Hypothesis. 

[Bar chart with legend Voiced / Voiceless; items shown: ‘geese’, ‘bark’, ‘tape’]

Figure 11: Errors are categorical


First, we are interested in the question whether Seth’s productions show any 
effects of anticipatory voicing harmony, analogously to initial devoicing. 


[Line chart: initial voicing by age (1;7-1;9, 1;10-1;12, 2;0-2;2, 2;3-2;5; x-axis: Age (years; months)), with one line for targets with a voiced C following and one for targets with no voiced C following]

Figure 12: Initial voicing: the role of following voiced consonants


Surprisingly, hardly any voicing harmony occurs. The initial voicing rate for 
/P...B/ targets falls significantly below that of other targets /P...P/ and /P...R/
(χ² = 11.7, p < 0.001). That is, the prediction from the Multiple Feature Hypothesis
that voiced obstruents are phonologically inactive is confirmed. We momentarily 
put aside the question of what causes the harmony-avoiding pattern in /P...B/ tar- 
gets, and break down the data for /P...P/ and /P...R/ targets, respectively. 

Data are broken down for following consonants ‘P’, ‘B’ and ‘R’ in Table 15, 
which is the counterpart of Table 13 (for initial devoicing). 

Age       1;7-1;9      1;10-1;12    2;0-2;2      2;3-2;5      1;7-2;5
Real:     [B]   [P]    [B]   [P]    [B]   [P]    [B]   [P]    [B]   [P]
/P...P/    23   278      7   295      3   304      1   242     34  1119
/P...B/     1   155      0   214      0    66      0    29      1   464
/P...R/     0   104      9   118      6   181      0   108     15   511


Table 15: Initial voicing in relation to following consonants (tokens)


/P...P/ and /P...B/ targets display a gradual decrease of initial voicing errors dur- 
ing Seth’s development. As noted above, the case of /P...B/ is most interesting 
since it shows no voicing errors except one in the first period. As compared to 
/P...B/, target /P...R/ shows more errors. 

Differences between categories turn out to be actually much smaller than in 
the case of initial devoicing, when measured by chi-square tests. Table 16 com- 
pares initial voicing rates of targets /P...P/, /P...B/, and /P...R/. 


                  1;7-1;9      1;10-1;12     2;0-2;2     2;3-2;5     1;7-2;5
/P...P/ versus    χ² = 10.1    χ² = 5.0      χ² = 0.7    χ² = 0.1    χ² = 11.7
/P...B/           (p < 0.01)   (p < 0.025)   (n.s.)      (n.s.)      (p < 0.001)
/P...P/ versus    χ² = 8.4     χ² = 5.7      χ² = 3.2    χ² = 0.4    χ² = 0.01
/P...R/           (p < 0.01)   (p < 0.025)   (n.s.)      (n.s.)      (n.s.)
/P...B/ versus    χ² = 0.7     χ² = 15.6     χ² = 2.2                χ² = 10.8
/P...R/           (n.s.)       (p < 0.001)   (n.s.)      (n.s.)      (p < 0.001)


Table 16: Initial voicing in relation to following consonants (tokens)
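
The pairwise comparisons in Table 16 can be illustrated with a short calculation. The sketch below assumes an uncorrected Pearson chi-square test on a 2×2 table of [B] and [P] token counts; the text does not name the exact test, so this is an assumption, and the function names are illustrative. The counts are those of the first period and of the overall column of Table 15.

```python
import math


def pearson_chi2_2x2(a, b, c, d):
    """Uncorrected Pearson chi-square for the 2x2 table [[a, b], [c, d]].

    Returns NaN when any row or column total is zero, since the statistic
    is undefined in that case.
    """
    n = a + b + c + d
    margins = (a + b) * (c + d) * (a + c) * (b + d)
    if margins == 0:
        return math.nan
    return n * (a * d - b * c) ** 2 / margins


# ([B] tokens, [P] tokens) per target type, as listed in Table 15
first_period = {"/P...P/": (23, 278), "/P...B/": (1, 155), "/P...R/": (0, 104)}
overall = {"/P...P/": (34, 1119), "/P...B/": (1, 464), "/P...R/": (15, 511)}


def compare(counts, x, y):
    """Chi-square for target type x versus target type y."""
    return pearson_chi2_2x2(*counts[x], *counts[y])


pairs = (("/P...P/", "/P...B/"), ("/P...P/", "/P...R/"), ("/P...B/", "/P...R/"))
for label, counts in (("1;7-1;9", first_period), ("overall", overall)):
    for x, y in pairs:
        print(f"{label}: {x} versus {y}: chi2 = {compare(counts, x, y):.2f}")
```

Under these assumptions the computed values come out close to the tabulated ones, for instance about 10.1 for /P...P/ versus /P...B/ in the first period. When neither target type has any voiced realizations, as with /P...B/ (0 of 29) and /P...R/ (0 of 108) in the last period, the statistic is undefined and the function returns NaN.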


Phonological inactivity of final /B/ (voiced obstruents) becomes clear from Tables 
15 and 16. Since the rate of initial voicing for /P...B/ targets falls far below that of 
other targets /P...P/ and /P...R/, we may safely conclude that final voiced obstru- 
ents fail to condition anticipatory voicing harmony. Non-occurrence of voicing 
harmony is straightforwardly predicted by the Multiple Feature Hypothesis on the 
assumption that English is a [spread glottis] language (see again Table 9). 

Table 16 also shows significant differences during the first two periods be- 
tween on the one hand, /P...P/ and /P...B/, and on the other hand, /P...P/ and 
/P...R/. Since in terms of its featural specification [spread glottis], /P/ forms no 
natural class with /B/, nor with /R/, both of which are unspecified, these findings 
are compatible with the Multiple Feature Hypothesis. Note, however, that the rela- 
tive ease of initial voicing in /P...P/ targets is not predicted by this hypothesis, 
which has nothing to say about non-harmonic effects. If initial voicing amounts to 
context-free neutralization, the question is why final /P/ apparently facilitates it. 
Below we will offer an answer based on the maintenance of lexical contrast. But 
first we turn to another surprising result shown in Table 16. 

The surprising finding is that /P...B/ and /P...R/, which ought to be indistin- 
guishable by the non-specification of /B/ and /R/, nevertheless significantly differ 
in period 1;10-1;12, where /P...B/ hardly shows any initial voicing, while /P...R/
does. Upon closer inspection, we find that the inhibitory effect in /P...B/ targets is 
caused by a single word, please, having 170 occurrences in this period, all of 
which fail to undergo initial voicing. Since please is a highly frequent item in 
Seth’s corpus, the inactivity of its initial consonant may be attributed to its rela- 
tively stable lexical representation. The high rate of initial voicing in /P...R/ tar- 
gets in the middle periods (1;10—2;2) is mostly due to alveolar stops: five cases of 
turn, all realized in rapid succession in a single recording session, and two cases 
of tan. Interestingly, the voiced realizations of these items were always preceded 
by nasals (for example, [san don] ‘sun tan’, and [n da’ af] ‘N turn off’, where the 
nasal is apparently a realization of ‘want’), suggesting a post-nasal voicing proc- 
ess in Seth’s phonology. Since alveolar stops regularly alternate with flaps in 
American English, these may also have been early attempts at flapping. We therefore
suggest that initial voicing in /P...R/ targets is due to phonological factors applying
across word boundaries, rather than to phonological activity of sonorants following
within the target.

In sum, there is no evidence for voicing harmony triggered by /P...B/, con- 
firming the predictions of the Multiple Feature Hypothesis. Target /P...R/ shows 
no initial voicing except for a temporary unexplained increase during a single pe- 
riod. For /P...P/, some development is visible, with high levels of initial voicing 
during earlier periods, followed by a sharp decline. It thus seems that segment 
types /P/, /B/ and /R/ cannot be distinguished as to their effect on initial voicing. 
This finding is predicted by the central hypothesis that ‘voicing’ errors amount to 
a context-free neutralization, a delinking of [spread glottis] → [ ], not conditioned
by segments elsewhere in the word. 

Finally, we turn to the unexplained low proportion of initial voicing errors in 
/P...B/ targets. As in German, we attribute initial voicing to context-free omission 
of [spread glottis], reflecting the instability of early lexical representations. Neu- 
tralization occurs at an average level of 2.9% in /P...P/ targets, whereas it is 
strongly inhibited in /P...B/ (average 0.2%). Our speculative account again starts 
from the hypothesis that initial voicing amounts to delinking of [spread glottis].
We observe that the key difference between the targets /P...P/ and /P...B/ is that 
the former contains two [spread glottis] elements, and the latter only one. We 
suggest that blocking of initial delinking in /P...B/ targets reflects an avoidance of 
wholesale deletion of [spread glottis]. 


(11)  Initial delinking resulting in complete loss of specified [spread glottis]


P...B → B...B


[spread glottis] 
The avoidance of initial delinking in /P...B/ and /P...R/ may thus be construed as
a way of maintaining the laryngeal contrast by preserving the single occurrence of 
[spread glottis]. 


(12) Initial delinking resulting in partial loss of specified [spread glottis] 


P...P → B...P


[spread glottis] [spread glottis] 


In contrast, /P...P/ targets have two occurrences, so that initial delinking still pre- 
serves one. 
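
The reasoning behind (11) and (12) can be made concrete with a toy model. The sketch below assumes that a word is reduced to its onset and coda classes, each of which either carries [spread glottis] or is unspecified; the representation and the function names are illustrative and are not part of the analysis itself.

```python
SG = "spread glottis"


def delink_initial(word):
    """Return a copy of the word whose first segment has lost its laryngeal feature."""
    # A 'P' without [spread glottis] is unspecified, i.e. it surfaces as 'B'
    return [("B", None)] + list(word[1:])


def keeps_spread_glottis(word):
    """True if at least one segment in the word still carries [spread glottis]."""
    return any(feature == SG for _, feature in word)


p_b = [("P", SG), ("B", None)]   # target /P...B/: a single [spread glottis]
p_p = [("P", SG), ("P", SG)]     # target /P...P/: two occurrences

for name, word in (("/P...B/", p_b), ("/P...P/", p_p)):
    output = delink_initial(word)
    print(name, "->", output, "| [spread glottis] preserved:", keeps_spread_glottis(output))
```

In this model, delinking in /P...B/ leaves no [spread glottis] anywhere in the word, whereas delinking in /P...P/ leaves the coda specification intact, which is the asymmetry the account relies on.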

In sum, the observed asymmetry between onset devoicing and onset voicing in 
Seth’s early word productions gives evidence from acquisition for the activity of 
[spread glottis] in English. Phonologically active coda obstruents are lexically 
specified for this feature, while voiced obstruents and sonorants are unspecified, 
hence phonologically inactive. The Multiple Feature Hypothesis predicts the ob- 
served asymmetry, while other models under consideration fail to account for it. 

However, two important questions remain. First, the issue of directionality of 
harmony effects arises: is laryngeal harmony from coda to onset matched by har- 
mony in the reverse direction, triggered by the onset, and effected in the coda? 
Second, a major issue arises as to whether the laryngeal harmony as witnessed in 
/B...P/ targets is due to lexical specification, or rather to surface realization. That 
is, we assumed that the influence of [spread glottis] is located at the level of lexi- 
cal representation, but have not presented any evidence bearing on this. We will 
discuss both issues below.¹⁴

To answer these questions, we need to look at word onsets and codas sepa- 
rately. For this reason, we will now turn to a selection of Seth’s productions, his 
monosyllables. 


6.2.4 Directionality of harmony: Seth’s monosyllables. Although the Multiple 
Feature Hypothesis makes no predictions¹⁵ about the directionality of laryngeal
harmony, we are nevertheless interested in directionality effects in Seth’s data, for 
two reasons. First, directional asymmetries in laryngeal harmony might provide 
clues about early lexical representations, related to the relative strength of specifi- 
cation for onset and coda consonants. Second, directionality in laryngeal harmony 
would strengthen the similarity with other types of consonantal harmony in lan- 
guage acquisition and in speech production. A well-known asymmetry in direc- 
tionality is found in consonant harmony in children’s productions (Menn 1971, 
Smith 1973, Pater & Werle 2001, 2003, Fikkert & Levelt 2002) as well as in adult 
speech errors (Fromkin 1973, Shattuck-Hufnagel 1979, Stemberger 1991a,b) in- 
cluding anticipations of voicelessness. We will first address the issue of whether 
laryngeal harmony is unidirectional in Seth’s productions, affecting onsets only, 
or bidirectional, affecting codas as well. 
Before comparing contextual effects in onset and coda devoicing, we briefly
address the relative stability of the laryngeal contrasts in onsets and codas in 
Seth’s monosyllables. Figure 13 shows laryngeal error rates for final and initial 
stops, respectively, during four stages. 


[Line chart: laryngeal error rate (Error %) by age (years; months) over four stages, with one line for initial and one for final obstruents]

Figure 13: Laryngeal error rates for initial and final obstruents


During the earliest stage (1;7-1;9), when Seth’s anticipatory devoicing pattern 
was at its peak, error rates for initial obstruents are slightly (although not signifi- 
cantly) above those for final ones. This suggests that the laryngeal contrast in final 
position is relatively stable as compared to initial position in early stages of pho- 
nological development. Speculatively, this would tie in with our earlier finding 
about Amahl’s development (§6.2, Figure 7), who acquired a stable laryngeal contrast
in coda slightly before it stabilized in onset.

The relative stability of the coda contrast naturally leads to the expectation that 
onset devoicing may be facilitated by voiceless codas, but not vice versa. This ex- 
pectation stems from the assumption (see §6.2.1) that laryngeal harmony is related 
to the instability of early lexical representations in the sense that harmony 
amounts to the influence of a relatively stable lexical specification (typically, a 
coda) onto a less stable one (typically, an onset). To test the predicted lack of 
harmony in coda devoicing, we compared onset and coda devoicing in Seth’s 
monosyllables, assessing the degree to which each is contextually conditioned by 
a voiceless obstruent. From the earlier mentioned database we extracted all targets 
(151 types, 3064 tokens) starting with a stop, which was voiced in about half of 
the cases (51.0% of types and 52.8% of tokens). We also extracted all targets (146 
types, 2406 tokens) ending in a stop, which was voiced in about one third of the 
cases (32.9% of types, 28.9% of tokens). 

To assess the relative strength of contextual factors in onset and coda devoicing,
we first sorted voiced and voiceless realizations of /B.../ monosyllables into three
categories, according to whether the target ended in a voiceless obstruent, a voiced
obstruent, or a sonorant (including vowels). We will refer to these categories as
/B...P/, /B...B/ and /B...R/ respectively. We then counted voiced and voiceless
realizations¹⁶ of /...B/ monosyllables, for the three target categories /P...B/,
/B...B/, and /R...B/. Characteristic examples of coda devoicing in Seth’s database are
[bet] ‘bed’ (1;9) in the category /B...B/, and [fajnt] ‘find’ (2;2) in the category
/P...B/. The results are shown in Tables 17 and 18.


            [B...]    [P...]
/B...P/       544        42
/B...B/       402         9
/B...R/       553        11


chi-square: χ² = 25.15, p < 0.001


Table 17: Initial devoicing in Seth’s monosyllables, relative to context


            [...B]    [...P]
/P...B/        90         5
/B...B/       312        25
/R...B/        65         1


chi-square: χ² = 3.48 (n.s.)


Table 18: Final devoicing in Seth’s monosyllables, relative to context
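
The two test statistics reported with Tables 17 and 18 can be checked against the token counts. The sketch below assumes a Pearson chi-square test of independence on the full 3×2 table (two degrees of freedom), which the text does not state explicitly; the function name is illustrative.

```python
def pearson_chi2(table):
    """Pearson chi-square statistic for an r x c table of counts (a list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for row, r_tot in zip(table, row_totals):
        for observed, c_tot in zip(row, col_totals):
            expected = r_tot * c_tot / n   # expected count under independence
            stat += (observed - expected) ** 2 / expected
    return stat


# Rows follow the order of the tables; columns are voiced and voiceless realizations
initial_devoicing = [[544, 42], [402, 9], [553, 11]]   # Table 17
final_devoicing = [[90, 5], [312, 25], [65, 1]]        # Table 18

# Critical value of chi-square with 2 degrees of freedom at p = 0.05
CRITICAL_DF2_P05 = 5.991

for name, table in (("initial", initial_devoicing), ("final", final_devoicing)):
    stat = pearson_chi2(table)
    print(f"{name} devoicing: chi2 = {stat:.2f}, exceeds 0.05 threshold: {stat > CRITICAL_DF2_P05}")
```

Under this assumption the initial devoicing counts yield a statistic of about 25.15, well above the threshold, while the final devoicing counts yield about 3.48, below it.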


Again, a strong effect is found for the contextual conditioning of initial devoicing
(χ² = 25.15, p < 0.001). This effect is due to the high proportion of initial devoicing
in /B...P/ targets, as compared to other targets. In contrast to initial devoicing,
final devoicing is not contextually conditioned. In spite of some minor differences
between targets, there is no effect of onset type (χ² = 3.48, n.s.), supporting our
hypothesis that final devoicing is context-free.

These results point to an interesting asymmetry in the directionality of laryn- 
geal harmony. While onset devoicing anticipates the coda’s voicelessness, coda 
devoicing shows no perseveration of onset voicelessness. That is, laryngeal har- 
mony shares its directionality with other consonantal harmony processes found in 
acquisition and adult speech errors (see references at the beginning of this sec- 
tion). 

Again we argue that the positional interactions in Seth’s productions give evi- 
dence for an abstract representation of the laryngeal contrast, which is unified be- 
tween the onset and the coda. As we saw earlier, the phonetic realization for la- 
ryngeal contrast in English strongly differs between onsets and codas. In onsets, 
VOT is the primary cue, while in codas, duration of the preceding vowel, closure 
duration, and (for some dialects) glottalization, are main cues. Since anticipation 
of the coda’s voicelessness by the onset cannot be reduced to anticipation of the 
articulatory gestures involved, we have a case for locating the positional interaction
at a more abstract level: that of contrastive specification. This gives evidence
from acquisition for representations of laryngeal contrasts involving monovalent
features. More precisely, activity of ‘voiceless’ obstruents (with non-activity of
‘voiced’ obstruents and sonorants) supports the specification of the feature [spread 
glottis], rather than [+voice]. 


6.2.5 The level of harmony: lexical versus surface specification. The final issue is 
whether initial devoicing in /B...P/ targets is due to lexical specification, or an 
effect of surface realization. We already saw some evidence from Seth’s early 
productions that laryngeal harmony is governed by lexical specification (see 
§§6.2.1 and 6.2.4). First, the phonological activity of voiceless obstruents, to the 
exclusion of voiced obstruents and sonorants, demonstrated in section 6.2, points 
to a relatively abstract level of representation, which is underspecified for laryn- 
geal features. Second, we discovered two further properties of harmony, its direc- 
tionality (anticipatory nature) and its non-local nature (passing across an interven- 
ing vowel), which both point to lexical representation as the relevant level. Both 
properties are reminiscent of processes which are assumed to be sensitive to lexi- 
cal representation, such as speech errors involving place of articulation and voice 
(see Stemberger 1991a,b). 

To further test our hypothesis that lexical specification, not surface realization, 
is the relevant level conditioning harmony, we checked whether /B...B/ target 
monosyllables whose final obstruent was produced voiceless due to final devoic- 
ing had an increased chance of undergoing onset devoicing as compared to 
/B...B/ targets whose final consonant remained voiced. As expected, we found no 
such effect. Thus, for predicting the likelihood of initial devoicing in a /B...B/
item, the surface specification of the final obstruent was no more informative than
its lexical representation. Tentatively, lexical specification alone (not surface
realization) may account for laryngeal harmony.

Interestingly, we also found that likelihood of onset devoicing increased as a
function of unfaithfulness in the coda, regardless of whether this involved deletion,
devoicing, or other segmental changes. This ‘unfaithfulness effect’ was found for
/B...B/ targets (χ² = 13.61, p < 0.001), as well as /B...P/ targets (χ² = 6.11,
p < 0.025). It need not have a phonological interpretation; instead we suggest a role
for general factors affecting accuracy of realization of the word or utterance as a
whole.¹⁷

The tentative conclusion that lexical representations are involved in initial de- 
voicing is supported by evidence from Seth’s monosyllables that shows that onset 
devoicing is, to some extent, sensitive to lexical frequency of individual items.¹⁸
To test for a correlation between item frequency and onset devoicing, we placed 
all forty /B...P/ and /B...B/ targets in a rank-order by frequency in Seth’s corpus, 
split the list into two halves (of most frequent and least frequent items), and calcu- 
lated the error rates for each list. We found that initial devoicing occurs less often 
in the most frequent items: only 5.1% of the most frequent items underwent initial 
devoicing, versus 10.7% of the least frequent items. The frequency effect (χ² =
4.01, p < 0.05) is, of course, compatible with developing lexical representations.
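
The split-half procedure described above can be sketched as follows. The sketch assumes that each item is recorded with its token count in the corpus and the number of those tokens showing initial devoicing, and that the error rate of a half is computed over its pooled tokens; the item names and counts are invented for illustration and are not taken from Seth's corpus.

```python
def split_half_error_rates(items):
    """Rank items by token frequency, split the ranking in two, and return
    (error rate of the more frequent half, error rate of the less frequent half).

    items: list of (name, n_tokens, n_devoiced) tuples.
    """
    ranked = sorted(items, key=lambda item: item[1], reverse=True)
    half = len(ranked) // 2

    def pooled_rate(group):
        tokens = sum(n_tokens for _, n_tokens, _ in group)
        errors = sum(n_devoiced for _, _, n_devoiced in group)
        return errors / tokens

    return pooled_rate(ranked[:half]), pooled_rate(ranked[half:])


# Hypothetical items: (name, tokens in corpus, tokens with initial devoicing)
toy_items = [("item_a", 100, 5), ("item_b", 80, 4), ("item_c", 10, 2), ("item_d", 10, 1)]

frequent, infrequent = split_half_error_rates(toy_items)
print(f"most frequent half: {frequent:.1%}, least frequent half: {infrequent:.1%}")
```

With an odd number of items this sketch places the middle item in the less frequent half; with the forty items mentioned above, the two halves contain twenty items each.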
7. Conclusions

This study of the acquisition of laryngeal contrast in three Germanic languages 
with binary laryngeal contrasts (Dutch, German, and English) offers evidence 
supporting the language-specific selection of laryngeal features [voice] and 
[spread glottis]. The Multiple Feature Hypothesis correctly predicts differences 
between Dutch and German in children’s error patterns in initial obstruents, as the 
result of neutralization to the unmarked value depends on the language-specific 
laryngeal feature: loss of [voice] for Dutch, and loss of [spread glottis] for Ger- 
man. However, we observed that the Articulatory Effort Hypothesis could also 
explain the asymmetry between Dutch and German, assuming that prevoicing and 
aspiration both pose articulatory challenges to the young child, which are avoided 
by devoicing (in Dutch) and de-aspiration (in German), respectively. For Dutch 
and German, we argued that the error patterns in acquisition cannot be fully ex- 
plained on the basis of input frequency. 

Focusing on evidence from the acquisition of English, we argued that an ac- 
count based on the phonological feature [spread glottis] explains the observed 
asymmetry between voiceless obstruents and other consonants. In Seth’s early 
productions, we found anticipatory devoicing, a case of laryngeal harmony, trig- 
gered by following voiceless obstruents but not by following voiced obstruents, 
nor by sonorants. This finding was interpreted as supporting the Multiple Feature
Hypothesis, i.e. the claim that languages with binary laryngeal contrasts differ in their ‘active’
laryngeal features, either [voice] or [spread glottis]. For English, a language which 
selects [spread glottis] as its active laryngeal feature, this correctly predicts that 
only voiceless obstruents trigger harmony. 

Finally, we argued that articulatory effort alone cannot account for observed 
effects of anticipatory devoicing because of its non-local nature and the abstract- 
ness of the specification involved. Anticipatory devoicing arguably involves an 
abstract level of featural organization, that of contrastive specification. We pre- 
sented additional evidence to support the hypothesis that the level of representa- 
tion that is relevant for laryngeal harmony is lexical representation: its non- 
locality, its sensitivity to lexical frequency, and its insensitivity to the presence of 
the triggering segment in the output. 

Perhaps the main interest of harmony patterns in children’s productions re- 
sides in the possibility of testing the nature of lexical representations in early 
childhood. Even within a single language, laryngeal contrasts may be realized by 
rather different articulatory gestures (which correspond to different acoustic pa- 
rameters) in syllable onset and coda. Evidence from children’s early productions 
for laryngeal harmony between coda and onset, two positions which differ in the 
articulatory implementations of the laryngeal specification, suggests that young 
children (starting around age 1;6) already construct phonological representations
that abstract away from the phonetic detail which differentiates specific positions.
Notes

1. An alternative to Iverson & Salmons’ multiple laryngeal features is that of Avery & Idsardi 
(2001) who express the features in more phonetic terms in three dimensions: Glottal Width (for 
aspiration languages like English), Glottal Tension (for voicing languages, such as Dutch) and 
Larynx Height for languages which have ejectives or implosives in their segment inventories. In 
this paper, we use the terminology [voice] and [spread glottis] to express the phonological con- 
trasts. 


2. Here we focus on word-initial stops. 


3. VOT interacts with place of articulation features (Lisker & Abramson 1964): dorsal stops have a 
longer VOT than both coronal and labial stops; in turn, coronal stops have a longer VOT than la- 
bial stops. 


4. Other monovalent features may also be used to capture laryngeal contrasts cross-linguistically, 
such as [constricted glottis] or [stiff vocal cords]. Languages may have more than one feature; for 
example, Thai, which has a three-way laryngeal contrast, employs both features [voice] and 
[spread glottis]. 


5. One could also look at the frequency of voiced vs. voiceless stops collapsed across different
prosodic positions, i.e. collapsed across word-initial, word-medial and word-final position (see 
Zamuner 2007). Based on these types of calculations, one finds that voiceless stops are overall 
more frequent than voiced stops. Input frequencies would then match the patterns of production 
seen in Dutch acquisition data. 


6. Data in the Nijmegen database were collected and transcribed by Susan Powers, Jürgen
Weissenborn, Wolfgang Klein, Heike Behrens, and Max Miller.


7. The stronger trend toward voicing in tokens can be attributed to the fact that many prefixed 
words start with /bə/ or /gə/. A related issue, which we leave for future work, is how prosodic
factors (mainly, stress) affect the salience of laryngeal contrasts in the input. For example,
contrasts in onsets of stressed syllables may be more salient than those in unstressed syllables,
such as prefixes.


8. As far as we know, no gestural accounts have been proposed which assign a single laryngeal 
gesture to onset and coda, thus spanning an entire syllable. Such accounts would necessarily be 
more abstract than the standard accounts, moving closer to a phonological representation. 


9. Dutch and German have word-final neutralization, hence monosyllables cannot be used to test 
predictions on laryngeal harmony. Logically, laryngeal harmony might affect initial and medial 
consonants in polysyllables in these languages, but unfortunately, relevant cases in the Dutch and 
German databases were too rare to base any conclusions on. 


10. Analysis is necessarily based on types, since Smith (1973) does not present token counts. 
11. Amahl’s mother spoke English as her fourth language, after Hindi, Bengali and Marathi.
According to Smith (1973:7-8), her speech was characterized by ‘fuller voicing of voiced obstruents’.


12. Stop-initial function words rarely occurred during the early stages of Seth’s development. 
Moreover, targets for function words were difficult to establish. 


13. If a single type occurred as both voiced and voiceless, it was coded as unfaithful. For example, 
bark was coded as [P...] because it occurred with both voiced and voiceless initial stops. 


14. A third question, which we will briefly address at the end of this section, concerns item- 
specificity: to what extent is the devoicing effect restricted to particular target words? 


15. Of course, we predict that if left-to-right (perseverative) laryngeal harmony were to be found,
then it should be of the devoicing type, rather than of the voicing type, since English employs 
[spread glottis] as its active feature. However, coda voicing effects were generally too rare in 
Seth’s productions for this prediction to be testable. 


16. Other realizations, such as those involving onset deletion, were left out of consideration. 


17. Some /B...P/ items were found in which initial devoicing occurred even though the final
obstruent was deleted, for example, [pli] ‘blink’ and [kij] ‘geese’ (1;7). Although such cases may
seem to constitute strong evidence for the relevance of lexical representation, their relevance is
somewhat undermined by the observation that deletion in /B...B/ items also increased chances of
devoicing in onset. Both cases can be explained by the general unfaithfulness factor discussed
above.


18. Thanks to Joe Stemberger for suggesting this to us. 
Exceptions to Final Devoicing


Marc van Oostendorp 
Meertens Instituut / KNAW 


Some dialects of Dutch show systematic exceptions to final devoicing in the first 
person singular of verbs ending in a long or tense vowel and a fricative. This obser- 
vation raises questions about the morphology — what makes the first person singular 
of verbs so special? —, and about the phonology — what makes fricatives after long 
vowels so special? As to the morphological side of things, this paper argues that the 
first person singular suffix, which used to be a schwa, is still present as an abstract 
vocalic position. From the phonological point of view, I argue that Dutch fricatives 
have a phonological length contrast rather than a voicing contrast. Since (empty) 
syllabic positions and consonant length both are expressed in the phonotactic dimen- 
sion, it is expected that they interact. 


1. Introduction: data and the issue 

All West-Germanic languages display the effects of the process of final devoicing
(FD).¹ This process is illustrated in (1) for standard Dutch: an underlyingly voiced
obstruent devoices when it occurs at the end of a syllable. The obstruent
is underlyingly voiced because it appears as voiced in forms where it does
not end the syllable, for instance because a following morpheme starts with a 
vowel. In (1) the contrast is illustrated by the verb stem bet ‘to wet’, which has an 
underlying /t/, and the noun bed ‘bed’ which has an underlying /d/. The [d] ap- 
pears when the vowel-initial plural suffix is added. 


(1)   bet   /bɛt/   [bɛt]   ‘(I) wet’     /bɛt+ən/   [bɛtən]   betten   ‘(we) wet’
      bed   /bɛd/   [bɛt]   ‘bed’         /bɛd+ən/   [bɛdən]   bedden   ‘beds’


As far as we know, there are no Dutch dialects that do not have FD at all. On 
the other hand, there are quite a few dialects that display exceptions to FD in cer- 
tain well-defined morphological contexts (De Schutter & Taeldeman 1986, De 
Vriendt & Goyvaerts 1989, Goeman 1999, van Bree 2003). 

A relatively widespread phenomenon found both in eastern and southern dia- 
lects of Dutch (including Flemish) is that the final fricative of a verbal stem with a 
long vowel in the final syllable remains voiced in the first person singular. The 
following facts are from fieldwork carried out in 2003 (Schoemans & van Oos- 
tendorp 2004): 
(2)

Town                final -v                     final -y
Beuningen           Ik yalorw’ ‘I believe’
Noord-Deurningen                                 Ik sary ‘I saw’
Rossum              Ik bli:v ‘I stay’            Ik mary ‘I may’

Schoemans & van Oostendorp (2004) did not find any comparable data with 
voiced coronal fricatives, and there were more instances of labial fricatives than of 
velars.² In some cases, the segment alternating with voiceless [f] was transcribed
phonetically as [w] rather than [v] (while transcriptions like [w’] may indicate 
partial devoicing) — a fact that we will briefly discuss in section 3 below. 

The data in (2) present the type of exceptional behaviour that will be studied in 
this article. At least two questions arise. In the first place, why are fricatives (after 
long vowels) involved, rather than stops? Secondly, why is the first person singu- 
lar involved rather than some other morphological form? Both issues will turn out 
to be closely related, although we will concentrate here mostly on the former one. 

Based on data from Dutch dialects in the so-called GTR database⁴, and dialectological
work by van Bree (2003), Goeman (1999), Weijnen (1991) and Schoe-
mans & van Oostendorp (2004), the main argument will be that a sufficiently so- 
phisticated view of representations obviates the need for a complex analysis using 
constraints promoting paradigm uniformity. 

This paper is organised as follows: I will first lay out a morphological
approach to exceptions of this type and argue that an approach in terms of paradigm
uniformity (on which devoicing would differentiate the first person singular too much
from other forms) faces severe problems; an Item-and-Arrangement approach, which
assumes that all underlying morphemes are expressed in the phonological output
representation, seems more successful. In section 3 I discuss
the analysis of voicing in fricatives, and show how the phonological behaviour of 
these elements can be made to follow from their representation: if we assume that 
fricatives prefer to co-occur with the distinctive feature [spread glottis], and if 
[spread glottis] segments are preferably long (both of which claims have been 
made in the literature), the relevant facts are direct consequences. In section 4, I 
will frame the debate in terms of Optimality Theory for the sake of concreteness, 
even though the issue, and most of the arguments pro and contra, are quite
independent of this particular choice. Section 5 presents a short conclusion.


2. Two approaches to the interaction between morphology and phonology 
There are two main approaches to capturing the special effect of the first per- 
son singular: 


1. Paradigmatic. The first person singular should resemble ‘related’ forms as
much as possible; application of final devoicing would increase the differences
between forms in the paradigm to an unacceptable level (cf. van
Bree 2003).

2. Structural. The first person singular has some property which blocks final 
devoicing (cf. Zonneveld 1978). 


These two approaches correspond roughly to two different views on morphol- 
ogy (the Word-and-Paradigm vs. the Item-and-Arrangement model; Hockett 1958, 
Robins 1959): the paradigmatic approach seems to be consistent with a view of 
morphology as a function that relates words, as essentially unstructured units, to 
each other, while the structural approach fits best with a view of morphology in 
which it is assumed that words are structured units of morphemes. 

Much modern literature within Optimality Theory (OT) converges on para- 
digmatic approaches to facts such as the ones that are currently under analysis 
(see, for instance, Benua 1997, Burzio 1998 and McCarthy 2002b). However, this 
choice does not necessarily follow; Item-and-Arrangement views of morphology
could also be formalized within OT.

In order to compare these two approaches to phonology-morphology interac- 
tion, we need some analytical tools to deal with final devoicing. Several of these 
are familiar from the current literature (see also other articles in this volume); here 
we will use the following (Lombardi 1991; cf. also Steriade 1997): 


(3)  FINALDEVOICING (FD)
Voiced obstruents are only allowed in a position preceding a tautosyllabic 
sonorant. 


This constraint describes the effect we need directly. In order to be active in the
grammar, it has to be ranked above the relevant faithfulness constraint, here
IDENT-IO(voice), which disallows changing the feature value for voicing. The
faithfulness constraint is given in (4) and the ranking in (5):


(4)  IDENT-IO(voice) 
Underlying specifications for voicing should be respected. 


(5)  FD » IDENT-IO(voice)


The tableau in (6) shows how the correct output is derived: 


(6)    input: /bɛd/ | FD  | IDENT-IO(voice)
     ☞ [bɛt]        |     | *
       [bɛd]        | *!  |
       [pɛd]        | *!  | *
       [pɛt]        |     | **!
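
The evaluation in (6) can be sketched in a few lines of Python. This is only an illustrative toy, not part of the original analysis: the segment inventory, the function names (`fd`, `ident_io_voice`, `evaluate`) and the ASCII stand-in "e" for the vowel of bed are all assumptions made here, and FD is implemented only for monosyllables. Strict domination is modelled by comparing violation profiles lexicographically.

```python
from typing import Callable, List, Sequence, Tuple

# Toy segment inventory: an assumption made for this illustration only.
VOICED_OBSTRUENTS = {"b", "d"}
SONORANTS = {"e"}  # "e" stands in for the vowel of bed

Constraint = Callable[[str, str], int]


def fd(inp: str, out: str) -> int:
    """FD as in (3), restricted to one syllable: one mark for every voiced
    obstruent that is not immediately followed by a sonorant."""
    marks = 0
    for i, seg in enumerate(out):
        following = out[i + 1] if i + 1 < len(out) else None
        if seg in VOICED_OBSTRUENTS and following not in SONORANTS:
            marks += 1
    return marks


def ident_io_voice(inp: str, out: str) -> int:
    """IDENT-IO(voice) as in (4): one mark for every segment whose voicing
    differs between input and output (compared position by position)."""
    return sum(
        (a in VOICED_OBSTRUENTS) != (b in VOICED_OBSTRUENTS)
        for a, b in zip(inp, out)
    )


def profile(inp: str, cand: str, ranking: Sequence[Constraint]) -> Tuple[int, ...]:
    """Violation marks of one candidate, ordered from highest to lowest ranked."""
    return tuple(constraint(inp, cand) for constraint in ranking)


def evaluate(inp: str, candidates: List[str], ranking: Sequence[Constraint]) -> str:
    """Return the optimal candidate. Tuples compare lexicographically,
    which mirrors strict domination of ranked constraints."""
    return min(candidates, key=lambda cand: profile(inp, cand, ranking))


CANDIDATES = ["bet", "bed", "ped", "pet"]
winner = evaluate("bed", CANDIDATES, [fd, ident_io_voice])
print(winner)
```

Reversing the ranking in the call to `evaluate` makes the fully faithful candidate win, which is the sense in which FD must dominate IDENT-IO(voice) to be active.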
In the paradigmatic approach, we need a special faithfulness constraint in
which the output is not compared to the input, but to a different output form (most 
likely, another form in the paradigm). Such a constraint could take various 
shapes — see Benua (1997), Kager (1999), McCarthy (2002b) for illustrative pro- 
posals — but we will formulate it as follows: 


(7)  IDENT-OO(VOICE) 
The specification for [voice] of the form under evaluation should be the 
same as the specification for [voice] in some designated other form in the 
paradigm. 


I will assume for the sake of the argument that the “designated other form” in 
the case of ik geleuv ‘I believe’ is the infinitive geleuven: 


(8)    input: /ɣəløːv/   | IDENT-OO(VOICE) | FD | IDENT-IO(VOICE)
       [ɣəløːvən]        |                 |    |
     ☞ [ɣəløːv]          |                 | *  |
       [ɣəløːf]          | *!              |    | *


What would the alternative, structural analysis look like? This approach would 
assume that, even though the vowel of the 1SG suffix has disappeared, it has not 
done so without leaving a trace. For instance, it might assume that the suffix consists
of a phonetically empty vowel position (symbolised as ∅ below), protecting
the consonant from being devoiced (because in this approach, too, devoicing 
would only affect consonants in absolute coda position). Diachronically, this 
would involve the following development: 


(9)    σ      σ      σ              σ      σ      σ
      / \    / \    / \            / \    / \    / \
     ɣ   ə  l   øː v   ə     >    ɣ   ə  l   øː v   ∅


The resulting configuration would not be subject to final devoicing, at least
under some definitions of this constraint, since the fricative would not occur in a
syllable coda.
This approach⁵ requires some degree of abstractness within the phonological
representation, viz. a zero morpheme in the shape of an empty vowel, but it should be
noted that it does not require reference to the notion of a paradigm, the exact defi- 
nition of which has also been a subject of debate. 

Let us compare the structural approach to the paradigmatic one. There are at
least three problems with the latter. The first of these concerns the geographical
distribution of the phenomenon involved. From dialect-geographic study, it appears
that exceptions to FD of the type discussed above are always found in the
vicinity of areas where the 1SG suffix is still overt. Tilligte, for example, borders
on an area where ik geleuve still occurs; the form has even been reported as an
indigenous variant for Tilligte itself (Goeman 1999). The same is true for southern
dialects displaying the process, such as Ghent (cf. Goossens 1977): they are always
in the vicinity of dialects in which the schwa is still pronounced, or the
schwa variant can even still be found, variably, in the dialect in question. This can
be clearly seen in Figure 1, which displays a map of the (European) Dutch-speaking
language area (The Netherlands and Flanders), where the circles denote
dialects with ‘exceptions’ to final devoicing for any of the verbs ‘to live’, ‘to stay’
and/or ‘to give’ (past tense); boxes indicate dialects in which the 1SG form of any
of these verbs ends in a schwa:⁶


[Map legend: boxes = 1SG schwa suffix (129); circles = voiced fricative (13)]


Figure 1: Schwa-deletion and voiced fricatives 


Especially the fact that the pattern is found in two unconnected areas is very
suggestive. If we assume that geography mirrors language change in this case, the
paradigmatic approach leads to a very strange and unexpected state of affairs. We
have to posit three stages of development:

1. ik geleuve. During this stage IDENT-OO(VOICE) is not active, because
schwa protects the fricative from becoming devoiced. Since it is usually
assumed that faithfulness constraints are ranked low by the language
learner unless there is evidence to the contrary, the constraint will have a
low position in the hierarchy during this stage.

2. ik geleuv. In this stage, some constraint is responsible for schwa deletion
(e.g. FINAL-C, McCarthy 2003, Swets 2004 and references there); at exactly
the same time, IDENT-OO(VOICE) should become highly ranked,
even though it is not clear if there is a formal connection between the two
constraint movements.

3. ik geleuf. At this point, IDENT-OO(VOICE) should again be ranked low
since it no longer affects the phonology of any segment.


The output-to-output faithfulness constraint has to move up and down during
language change; it is unclear what the relation is between this fact and the
disappearance of schwa. In particular, we could expect IDENT-OO sometimes to move
up in dialects without recent schwa apocope, so that we would find individual 
spots where exceptions to FD are not surrounded by places where there is still a 
schwa. As pointed out above, this does not seem to happen. 

Notice that the structural approach does not suffer from this problem. The lan- 
guage change is incorporated, as it were, into the phonological representations. 
The order of events in this case would be that the schwa would first be deleted, 
leaving behind its structural position. After a while this unstable situation would 
be resolved by loss of the empty position, and the fricative would end up in a co- 
da. Within this approach, there is no reason why a coda fricative would ever 
‘spontaneously’ create an empty vowel at the end of the word. 

The second problem with the paradigmatic approach concerns the structure of 
the paradigm. In the presentation of the paradigmatic account above, we assumed 
that we could establish in some way that the “designated other form” in the para- 
digm in the case of ik geleuv is geleuven; the latter form is the infinitive and plural 
form of the verb (for all persons) in the Standard language. On closer scrutiny, 
this view is problematic. In some of the dialects under discussion, it is not clear 
what the role of this form in the paradigm is. The dialects surrounding Tilligte, for 
instance, have a plural ending in -t, so that actually all other forms in the present
tense are geleuft with a devoiced cluster. 


(10)   1SG  geleuv     1PL  geleuft
       2SG  geleuft    2PL  geleuft
       3SG  geleuft    3PL  geleuft


The infinitive is geleuven in these dialects, but it is not clear why it should be 
the infinitive that should block final devoicing. There also is no apparent reason 
why the influence of this ‘designated other form’ should be restricted to the 1SG; 
the other forms in the paradigm could have become geleu[vd], but as far as I have 
been able to ascertain, this form is never attested. In fact, the 2SG form is geleuf in
many dialects if it precedes the subject (e.g. in questions where there is subject 
inversion). Yet the fricative is never voiced in this case.⁷

Because the notion of a paradigm does not play a crucial role in the structural 
approach, this problem does not affect it. The form geleuv is evaluated independ- 
ently of other forms in the paradigm, and it does not matter what these other forms 
are, although one could argue that in the absence of any other voiced fricative 
form, the language learner would no longer have a reason to posit such a form in 
the first place. 

The third problem with a paradigmatic account is that exceptions to final de- 
voicing always involve fricatives. In this approach, it is unclear why fricatives 
should be more sensitive to paradigmatic influence than other consonants. We
could reformulate the relevant faithfulness constraint in the following way: 


(11) IDENT-OO(Voice, fricatives) 
The specification for [voice] in fricatives for the form under evaluation 
should equal the specification for [voice] in some designated other form in 
the paradigm. 


A constraint of this type would not rank very highly on the scale of explana- 
tory adequacy. Since fricatives do not support voicing contrasts as easily as stops 
do, one might actually expect the opposite state of affairs. However, the struc- 
tural approach does not immediately offer a valid alternative. There is no reason 
why a plosive could not occur in the onset of an otherwise empty syllable in the 
same way as a fricative. This problem needs to be solved first before we can use 
the status of fricatives as a fatal objection to the paradigmatic approach, and this is 
therefore the topic of the next section. 


3. Voicing and fricatives in Dutch 

At first sight, it may seem absurd that fricatives are involved in exceptions to 
FD, regardless of our morphological theory: phonetically they are less compatible 
with voicing than plosives (cf. again fn. 7 above). It is even the case that in those 
cases in which exceptions to final devoicing are not triggered by the morphology, 
we seem to find the inverse pattern: fricatives devoice before plosives do. In a 
survey of Dutch dialects, van Bree (2003) mentions that 


“not all potential target sounds take their turn at the same time: devoicing 
clearly takes place earlier with fricatives than with occlusives (...); this might 
be related to the fact that the unmarked state for fricatives is voicelessness.”⁹


One could object to this that the fricatives in these dialects tend to have a so- 
norant type of realization; e.g. the voiced [v] was sometimes transcribed as [w] in 
our data. One might hypothesize that this means that the sound is really a sonorant 
/w/ underlyingly; but such an assumption does not explain why the segment de- 
voices in other contexts, for instance when it occurs next to a voiceless obstruent. 
Other sonorants never devoice at the end of the word, so that we conclude that at 
least phonologically /v/ still counts as an obstruent. On the other hand, the realization
[w] for an underlying fricative /v/ is not unexpected; such a pronunciation
will enhance the perceptibility of the voicing of this segment.¹⁰

We will have to take into account the fact that there is a difference between 
those cases in which morphology is involved, and in which fricatives tend to get 
voiced, and those cases in which it is not, and in which fricatives are not voiced at 
all. For now, let us concentrate on the former case. Interestingly, there is another 
well-known example of a language in which fricatives constitute exceptions to 
FD, viz. Turkish (Kaisse 1986, Rice 1993):¹¹


88 MARC VAN OOSTENDORP 


(12)   stops may alternate:

       yüzük ‘ring, NOM.SG’    ~  yüzüğü ‘ring, ACC.SG’
       kağıt ‘sheet, NOM.SG’   ~  kağıdı ‘sheet, ACC.SG’
       şarap ‘wine, NOM.SG’    ~  şarabı ‘wine, ACC.SG’

       fricatives do not alternate:

       az ‘little’
       ev ‘home’


There is arguably a special relation between fricatives and voice if we look at 
it from a cross-linguistic perspective. According to Maddieson (1984: 48), “bila- 
bial, dental and palatal non-sibilant fricatives are found to occur without a voice- 
less counterpart more often than with one”. 

Van Oostendorp (2002) argues on the basis of phonotactic distribution that in 
some West-Germanic dialects — and in particular in Dutch — the opposition voiced 
vs. voiceless for fricatives should be replaced by the opposition short vs. long.¹²
Phonetically, these oppositions are clearly related (Slis & van Heugten 1989; van 
Rooy & Wissing 2001). This explains facts such as those above: in Turkish, frica- 
tives are not sensitive to FD since their representation does not include the feature 
[voice] (an idea which is clearly also present in the approach of Rice 1993 re- 
ferred to above). The fact that short (i.e. voiced) fricatives should occur more fre- 
quently than long (i.e. voiceless) ones is also hardly surprising from this point of 
view. 

At first sight, it might seem problematic to replace the voicing opposition with
a length opposition completely in Dutch (at least in Standard Dutch and the dialects
under consideration here), but there is evidence that shows that the two dimensions
are correlated, e.g. the fact that short lax vowels (almost) exclusively
occur before voiceless fricatives and long (tense) vowels (almost) exclusively
before voiced ones.


(13)   knuffel  [knœf]  ‘hug’    *[knøːf]
       heuvel   [høːv]  ‘hill’   *[hœv]


This pattern can be most easily accounted for if we assume that long vowels 
occupy two moras, short vowels only one and if voiceless (i.e. long) fricatives are 
represented as moraic. Stressed syllables must then consist of maximally (and
minimally) two moras:¹³


(14)  a.    σ          b. *  σ          c.    σ          d. *   σ
           / \               |               / \              / | \
          μ   μ              μ              μ   μ            μ  μ  μ
          |   |              |               \ /              \ /  |
      kn  œ   f          h   œ   v       h    øː   v      kn   øː  f
In (14a), a short vowel is followed by a ‘long’ consonant, which is fine. In
(14b), the short vowel is followed by a short consonant; this structure is too short:
it contains fewer than the minimum of two moras. In (14c), a long vowel is followed
by a short consonant, which is again fine. In (14d), a long vowel is followed
by a long consonant, which results in an illicit three-mora structure.
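
The mora arithmetic behind (14) can be made explicit with a small sketch. The function and variable names below are hypothetical and introduced only for illustration; the sketch simply assumes, as in the text, that a long vowel contributes two moras, a short vowel one, a voiceless (long) fricative one, and a voiced (short) fricative none.

```python
def mora_count(long_vowel: bool, voiceless_fricative: bool) -> int:
    """Moras in a stressed syllable closed by a fricative, under the
    assumptions stated in the text."""
    vowel_moras = 2 if long_vowel else 1
    fricative_moras = 1 if voiceless_fricative else 0
    return vowel_moras + fricative_moras


def is_licit_stressed_syllable(long_vowel: bool, voiceless_fricative: bool) -> bool:
    """Stressed syllables must have exactly two moras."""
    return mora_count(long_vowel, voiceless_fricative) == 2


# The four structures of (14): (long vowel?, voiceless fricative?)
STRUCTURES = {
    "14a": (False, True),   # short vowel + voiceless fricative
    "14b": (False, False),  # short vowel + voiced fricative
    "14c": (True, False),   # long vowel + voiced fricative
    "14d": (True, True),    # long vowel + voiceless fricative
}

verdicts = {
    label: is_licit_stressed_syllable(*shape) for label, shape in STRUCTURES.items()
}
print(verdicts)
```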

There is some empirical support for this assumption in the work of Ernestus
(2000: 177). Based on a corpus of spontaneous (Standard Dutch) speech, Ernestus 
notes that 


“Clusters of fricatives of the same place of articulation arise when a word-final 
fricative is followed by a word-initial one. These clusters are generally realized with 
a duration that is shorter than the duration of two segments (...). In what follows, 
clusters consisting of two segments with the same manner and place of articulation 
will be referred to as geminates. 

(...) The problem is that fricative geminates are always realized as voiceless, in- 
dependently of their context, exact duration, etc.” 


A somewhat more complicated argument for the same relation between frication 
and length, finally, comes from a number of Brabantish and Flemish dialects of 
Dutch (De Schutter & Taeldeman 1986) in which the deletion of /t/ in clusters 
causes the fricative in such clusters to become devoiced. So, hij doe[t v]eel ‘he
does a lot’, is realised as hij doe[f]eel. The same does not happen (or happens 
much less frequently) if the consonant which followed the /t/ in underlying form 
is a plosive. This could be analysed as a case of opaque interaction between pro- 
gressive assimilation — which is indeed a rule that applies in Dutch in clusters end- 
ing in fricatives — and t-deletion. On the assumption that voiceless fricatives are 
long, however, a different solution is also possible: deleting /t/ would leave a posi- 
tion to be filled up by the fricative, which would thereby become long. Devoicing 
would thus be a form of compensatory lengthening. 
Based on these arguments, we postulate the following correlation: 


(15)  If a fricative is attached to one position, it is voiced, and vice versa.


This might lead one to conclude that the phonological distinction between voi- 
ced and voiceless fricatives is one of length rather than of voicing. However, in 
the usual case fricatives devoice in Dutch just like stops. Devoicing is usually de- 
scribed as delinking of the feature [voice] or of the Laryngeal node (Lombardi 
1991, 1995, 1999). If we were to subscribe to a length theory of fricatives, we 
would need an alternative account, which would need to say that fricatives somehow
lengthen at the end of the syllable or at the end of the word. It is not immediately
clear that such an account can explain why the
fricatives in first person singulars do not lengthen.

The second problem seems even more severe. One of the most well-known 
aspects of Dutch phonology is that it has voicing assimilation in obstruent clus- 
ters. This assimilation process involves stops and fricatives alike. One example 
will suffice to show the problem: 
(16)   a/f/+/d/oen     >  a[vd]oen     ‘take off’
       a/f/+/t/akelen  >  a[ft]akelen  ‘go to seed’


In autosegmental terms, this change can easily be described in terms of a fea- 
ture [voice], spreading from the stop to the fricative. This then is a clear contrain- 
dication to the assumption that the distinction among fricatives is primarily one of 
length. 

Since there seem to be quite a few problems with the length-based account, we 
will now turn to an alternative account based on traditional features. Vaux (1998) 
argues in favour of the view that voiceless fricatives are represented as [+spread 
glottis] (like aspirated stops). The proposal is dubbed Vaux’s Law in Avery & Id- 
sardi (2001), and we will formulate it in the form of an implicational constraint: 


(17)  VAUX’S LAW: Fricative → [spread glottis]
‘Fricatives preferably have the feature [spread glottis]’ 


Vaux (1998) presents arguments from (several dialects of) Armenian, as well 
as from Sanskrit, Pali, the historical development of Modern Greek and from Thai 
for this implication. 

Some of the facts of Dutch discussed above might be amenable to an analysis 
along the same lines. For instance, the fact that fricatives seem more resistant to 
devoicing than stops can be understood, because voiced fricatives might be re- 
garded as (literally) more marked than voiceless ones, in the same sense that aspi- 
rated stops are more marked than unaspirated ones. Devoicing a fricative involves 
adding [+spread glottis], which is incompatible with an analysis in which final 
devoicing is an instance of delinking the Laryngeal node.¹⁴ On the other hand, we
would obviously need an account of final devoicing that would regard it in some 
cases as a form of final fortition (such an account seems feasible, however; cf. 
Iverson and Salmons 2003a,b, 2006, forthcoming; Vaux & Samuels 2005; Kipar- 
sky 2006). 

Also relevant to the proposed similarity in representation between voiceless
fricatives and aspirated plosives is the well-known fact that aspirated plosives
are substantially longer than unaspirated plosives. In this light, consider
the proposal by Ringen (1999) of a constraint MULTILINK, which demands
that [spread glottis] must be linked to two positions, capturing the length effects 
mentioned above: 


(18) MULTILINK 
The feature [+spread glottis] must be linked to two positions. 


The relation expressed by MULTILINK could be seen as a (mutual) enhance- 
ment of contrast of length and a laryngeal feature. Ringen uses this constraint to 
explain why underlyingly aspirated stops in Icelandic are not allowed to surface as 
aspirated when they occur in a cluster. In this case, they occur as ‘preaspirated’ 
stops, sharing the feature [spread glottis] with an [h]. The fact that in English onset
clusters, aspiration spreads from the stop to the onset ([pl̥]ead, [tr̥]ain, etc.)
could be explained by invoking this constraint in a similar way.

Extending this interpretation of MULTILINK, we could also use it to explain 
why voiceless fricatives are (preferably) long or in a cluster. It has indeed been 
proposed in the literature that a feature [tense] on fricatives is phonetically cued 
primarily by length (cf. Jessen 1998 for an overview; cf. also van Rooy & Wissing 
2001). To the extent that [tense] and [spread glottis] can be regarded as the same 
formal object, MULTILINK can be regarded as a formalisation of this idea. A short 
voiceless fricative prefers to share its [spread glottis] specification; it can do this 
either by being long (assuming the parts of the long fricative help each other to 
satisfy MULTILINK), or by occurring in a voiceless cluster. 

In order to account for the fact that Standard Dutch does not have aspirated 
(i.e. [spread glottis]) stops, we invoke the following constraint: 


(19) *NGO, NOGEMINATEONSETS: 
Stops in onsets are never long (no initial geminate stops) 


The constraint clearly has some typological value, since geminates are absent 
from onset positions more often than anywhere else. MULTILINK, together with 
VAUX’SLAW can help us to formulate the behaviour of intervocalic fricatives in a 
much more insightful way, as will be shown now. An interesting aspect of our
current findings is that they allow us to understand the dual behaviour of voicing in
fricatives: it behaves both as a length distinction and as a feature difference,
because it involves both.

An important observation is that in one of the regions (the Twente region in 
the East, on the border with Germany) in which we find exceptions to final de- 
voicing as discussed here, we find aspirated stops in foot-initial onsets. This is 
important since it could undermine the basis of our account in which the two 
classes of obstruents are represented differently. If both fricatives and plosives are 
organized according to a [spread glottis] contrast rather than according to [voice], 
it is not clear why one would be an exception to final devoicing whereas the other 
could not. 

However, the observations made above about the difference in behaviour be- 
tween plosives and fricatives in intervocalic position also hold in these dialects 
(Schoemans & van Oostendorp 2004), so that it seems that in this position there 
still is a phonological length difference. Further, inspection of the data in the GTR 
database shows that the aspiration contrast is restricted to word- or foot-initial po- 
sition; in other positions we find a voicing contrast instead. Since we are analyz- 
ing the exceptions to final devoicing as special cases of ‘intervocalic’ fricatives, 
aspiration in these dialects poses no specific problem to the account proposed here.


4. OT Formalisation 

In the preceding sections we have seen, first, that a structural account of the 
special behaviour of the first person singular seems more promising than a para- 
digmatic account, and, second, that a theory of voicing in fricatives which is based 
on length is feasible although not unproblematic. We will now try to put the pieces
together to see whether we can produce a coherent analysis that can deal with
all of these facts at the same time. I have chosen OT as my framework of analysis, 
since it offers a fairly standard lingua franca in which the discussion can be fra- 
med. 

The core of the analysis consists of the constraints VAUX’S LAW, requiring fricatives to
be [spread glottis] (‘voiceless’), and MULTILINK, requiring [spread glottis] to be
spread across two positions. It is necessary, first, to show how these two constraints
can account for the behaviour of fricatives in intervocalic position, in interaction
with a constraint on syllable well-formedness, to the effect that long
consonants are not allowed after long vowels (recall the facts in (13-14) above;
the constraint is referred to as *μμμ here) and assuming that faithfulness constraints
are ranked conveniently (i.e. vowels are not allowed to change their
length, but fricatives can change both their length and their voicing
specification):¹⁵


(20)  a. /aːsaː/, /aːzaː/  | *μμμ | MULTILINK | VAUX’S LAW
       ☞ i.   aːzaː        |      |           | *
         ii.  aːsaː        |      | *!        |
         iii. aːsːaː       | *!   |           |
      b. /azaː/, /asaː/    |      |           |
       ☞ i.   asːaː        |      |           |
         ii.  asaː         |      | *!        |
         iii. azaː         |      |           | *!


The input in (20a) has a long vowel preceding the fricative. The winning candidate
in (20ai) has a voiced fricative, but no alternatives are available with a voiceless
fricative; (20aii) has a short voiceless fricative, violating MULTILINK, and a long
voiceless fricative can only be introduced here at the cost of introducing a superheavy
syllable in the middle of a word.

The inputs in (20b), by contrast, have a short vowel. The winning candidate
can now satisfy all relevant constraints, since it can contain a long and voiceless 
fricative. Alternatives will always either have a short voiceless fricative (20bii) or 
a voiced fricative (20biii), violating markedness. 

In order to describe the behaviour of fricatives at the end of the word, we need 
to take a closer look at the actual structure of the word in that position. Dutch syl- 
lables are usually minimally and maximally bimoraic; trimoraic syllables are only 
found at the end of words. As a matter of fact, the end of word is even less restric- 
tive. Here, we even find extra (coronal) consonants beyond the template. We thus 
have words such as herfst ‘autumn’ where herf is a trimoraic syllable and st is a 
cluster of ‘extrasyllabic’ segments, which are outside the syllabic structure proper. 
I assume that these extra positions are also available for the second half of gemi- 
nates at the end of words: 
(21)    /aːs/, /aːz/  | *μμμ | MULTILINK | VAUX’S LAW
      ☞ aːsː          |      |           |
        aːs           |      | *!        |
        aːz           |      |           | *!
        aːzː          |      | *!        | *

        /as/, /az/    | *μμμ | MULTILINK | VAUX’S LAW
      ☞ asː           |      |           |
        as            |      | *!        |
        az            |      |           | *!
        azː           |      | *!        | *


The exceptional cases such as ik geleuv in (2), on the other hand, can be dealt
with if we assume that (a) here the fricative appears in an onset of an empty-headed
syllable (as was argued above), and (b) geminates are not allowed in an
onset in this position:


(22)    /ɣəløːv/       | *μμμ | *NGO | MULTILINK | VAUX’S LAW
      ☞ ɣə.løː.vV      |      |      |           | *
        ɣə.løːf.fV     | *!   |      |           |
        ɣə.løː.ffV     |      | *!   |           |
        ɣə.løː.fV      |      |      | *!        |


(22) gives a comparison of [ɣəløːv] with all of the conceivable outputs
that have a voiceless consonant. The winning candidate violates VAUX’S
LAW (since it does not have the feature [spread glottis]), but it beats all its
competitors on some higher-ranking constraint.
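
The competition in (22) can be illustrated with a short Python sketch. This is a toy under explicit assumptions rather than the analysis itself: the `Candidate` fields, the constraint functions and the candidate labels are hypothetical names introduced here, each candidate is described by hand in terms of the structural properties the four constraints care about, and the relative order of *μμμ and *NGO in the list is arbitrary.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Candidate:
    label: str
    spread_glottis: bool        # fricative is voiceless, i.e. [spread glottis]
    positions: int              # 1 = short fricative, 2 = long (geminate)
    closes_long_syllable: bool  # half of the fricative is a coda after a long vowel
    onset_geminate: bool        # both halves of the geminate sit in the onset


Constraint = Callable[[Candidate], int]


def no_trimoraic(c: Candidate) -> int:   # *μμμ
    return 1 if c.closes_long_syllable else 0


def no_geminate_onset(c: Candidate) -> int:   # *NGO, cf. (19)
    return 1 if c.onset_geminate else 0


def multilink(c: Candidate) -> int:   # (18)
    return 1 if c.spread_glottis and c.positions < 2 else 0


def vauxs_law(c: Candidate) -> int:   # (17)
    return 0 if c.spread_glottis else 1


RANKING: List[Constraint] = [no_trimoraic, no_geminate_onset, multilink, vauxs_law]


def profile(c: Candidate) -> Tuple[int, ...]:
    """Violation marks ordered from highest to lowest ranked constraint."""
    return tuple(constraint(c) for constraint in RANKING)


CANDIDATES = [
    Candidate("short voiced fricative in onset", False, 1, False, False),
    Candidate("geminate split over coda and onset", True, 2, True, False),
    Candidate("geminate inside onset", True, 2, False, True),
    Candidate("short voiceless fricative in onset", True, 1, False, False),
]

# Lexicographic comparison of profiles mirrors strict domination.
winner = min(CANDIDATES, key=profile)
print(winner.label, profile(winner))
```

Each voiceless competitor incurs a mark on a constraint ranked above VAUX’S LAW, so the voiced candidate wins despite its own violation.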

The difference between the dialects that do allow for this type of structure and 
those which do not can now be reduced to the question whether or not the dialect 
allows an empty vowel in this particular configuration. The empty vowel should 
be licensed by the 1st person singular morpheme. We can assume that the con- 
straint responsible for this is the following (cf. Kurisu 2001 and references cited 
there for related, although not completely identical proposals): 


(23) REALIZEMORPHEME 
A morpheme should somehow be expressed in the phonological surface 
representation. 


In the dialects which allow these exceptions, REALIZEMORPHEME is ranked
above whatever constraints there are against empty vowels (*EMPTY). For dialects
which do not allow for this possibility, there are two options. The least interesting
option is that in these dialects the ranking is *EMPTY » REALIZEMORPHEME.
Somewhat more interesting would be the proposal that the ranking does not
change, but the first person singular suffix loses its status as an independent
morpheme, so that REALIZEMORPHEME no longer plays a role and therefore does not
license the empty vowel, which will no longer be postulated in the phonology.

Notice, however, that we still lack a formal answer to the question of why 
stops do not display the same kind of behaviour as fricatives. The answer is that in 
this case the relevant property (voice) is not dependent on syllable positions di- 
rectly, and not interpreted in terms of length. 

Yet according to the definition of FD in (3), [voice] also cannot appear in the onset
of otherwise empty syllables, since it is not followed there by a tautosyllabic sonorant.
Consider the following tableau: 


(24)    /baːd/ ‘bathe’   | *μμμ | FD
        i.   baː.dV      |      | *!
      ☞ ii.  baː.tV      |      |
        iii. baːt        | *!   |
        iv.  baːd        |      | *!


This tableau proves that obstruents will always be devoiced, regardless of the 
morphological structure. The forms in (24i) and (24iv) have a voiced obstruent 
which is not followed by a sonorant; the form in (24iii) has a superheavy syllable. 
(24ii) avoids the problem of a superheavy syllable by introducing an empty vo- 
wel, and the FD problem by making the obstruent preceding the empty vowel voi- 
celess. 


5. Conclusion 

In this article, I have shown that a sophisticated view of representations can
provide us with insight into a phenomenon that seems simple at first sight, but
which turns out to be quite problematic on closer inspection. The fact that exceptions
to final devoicing are only found in first person singular forms of verbs ending
in a (long vowel plus) fricative may seem almost trivial, but I
hope to have shown that at present it seems to shed light on at least two different
debates in linguistic theory: morphology-phonology interaction, and phonological
representations.

On the one hand, this phenomenon can be most satisfyingly accounted for in a
theory which relies not so much on paradigm uniformity as on postulating
a somewhat abstract morpheme for the 1SG. Notice that this analysis can
also be seen as an argument in favour of (some amount of) phonological structure;
it does not work without being able to refer to the syllabic position ‘onset’.

Similarly, the reason why fricatives behave differently from stops required ex- 
planation, and preferably one which links this particular difference between frica- 
tives and stops to other differences, such as that in assimilation in clusters. Again, 
this could be attained by studying the representations we need more closely. This 
paper therefore will hopefully provide a further impetus to the revived interest in 
representational issues in phonology. 
Notes 

1. There are many differences between these languages; for instance, it is well-known that some 
varieties of Yiddish do not (or do no longer) devoice obstruents at the end of the word (Lombardi 
1991, 1995; Wetzels & Mascaró 2001) and it is claimed for Frisian that FD did not start to operate 
until the beginning of the 20th century (Tiersma 1985). See van Bree (2003) for an overview. 


2. Schoemans & van Oostendorp (2004) present a larger set of data. 


3. I will not discuss the different behaviour of distinct places of articulation; notice that our find- 
ings correspond in part to a hierarchy according to which velars in general are more prone to de- 
voicing than coronals, which in turn devoice more easily than labials. 


4. This database is available online at http://www.meertens.knaw.nl/projecten/mand/data.html. 


5. Zonneveld (1978) is the Urheber of this idea in the Dutch literature, albeit for facts which are 
quite different from the ones studied here; see van Oostendorp (2005) for a comparison; cf. also 
Kaye, Lowenstamm & Vergnaud (1990) and many others, for theoretical proposals regarding the 
nature of such empty positions. 


6. Note that in the north-east, on the border of the IJsselmeer (Issel lake), there are a few circles
which are not close to dialects where schwa is pronounced. It is possible that schwa deletion has
started to operate relatively recently in this region. Another possible reason is that these are so-called
West-Frisian dialects, and very similar to Frisian in many ways. As was noted in footnote 1,
final devoicing did not apply to Frisian for a very long time (the province of Fryslân borders on the
West-Frisian area, but data from this language have not been included in the survey on which this
map was based), which explains the white spot on the province of Fryslân (the province in the
north, below the three rightmost islands). Potentially, then, these dialects are indeed on the border
of a linguistic area, albeit one at the opposite side of the lake.


7. A complicating factor is that the subject is always a second person singular pronoun or clitic; in 
some dialects this is an (underlyingly voiced) fricative, and fricative clusters are never voiced in 
Dutch (cf. Zonneveld, this volume). Another problematic case is where the pronoun (and espe- 
cially the clitic) starts with a vowel; in that case lack of devoicing can be understood as resyllabifi- 
cation. In many dialects, however, the second person pronouns and clitics start with a glide; in this 
case there should be no problem in voicing the fricative. 


8. A reviewer points out that “many varieties of Netherlandic — not just dialects, but also colloquial 
varieties — are losing voicing distinctions in fricatives. In the midst of such a change, it becomes 
somewhat less surprising that the pattern treated here would arise — fricative devoicing is part of a 
broader pattern.” It is indeed true that many varieties of Dutch seem to be losing the voicing dis- 
tinction in fricatives; notice that this means that laryngeal faithfulness is apparently less important 
for fricatives than for stops. 


9. “niet alle in aanmerking komende klanken [komen] tegelijk aan de beurt [...]: bij de fricatieven 
vindt er duidelijk eerder verscherping plaats dan bij de occlusieven [...]; dat kan er verband mee 
houden dat de ongemarkeerde toestand waarin een fricatief zich bevindt, die van stemloosheid is” 
(van Bree 2003: 7; my translation MvO). 


10. It is well-known that languages of the world may have (especially labiodental) segments that 
are difficult to classify as either fricatives or approximants. See e.g. Padgett (2002) for Russian 
and Hamann & Sennema (2005) for German, as well as references cited there for, for instance, 
other Slavic languages and Hungarian. 


11. Rice (1993) and Avery (1996) give these as an example of ‘sonorant obstruents’: the voicing of
fricatives is a result of a (non-laryngeal) feature Sonorant Voice, but the stops are voiced by laryngeal
[voice] and the final devoicing rule targets only the latter. This does not explain, however,
why the asymmetry goes exactly this way (it seems to be similar in many of Rice’s (1993) examples;
to be more precise, there is no example where stops have Sonorant Voice, but fricatives have
[voice]).


12. See Avery (1996) and Iverson & Salmons (2003a,b, 2006, forthcoming) for related positions. 
Kraehenmann (2003) offers an extensive treatment of the fortis/lenis distinction in Alemannic in 
terms of length. 


13. See van Oostendorp (2002) for a full analysis. 


14. A reviewer points out that it would be possible to regard devoicing as delinking of [voice] and 
subsequent addition of [spread glottis] as a type of enhancement. Such an analysis would be most 
straightforward if not the whole Laryngeal node is delinked. 


15. In the following tables, there is sometimes more than one input representation in the lefthand 
table: these will surface in the same way due to the irrelevance of most faithfulness constraints. 


16. If we assume that neither vowels nor fricatives can change in any way, the resulting language 
will be one in which voicing (or length) of fricatives is not dependent on syllabification; but if 
either voicing or length of fricatives can change, or if the vowels can change, we will obtain a pat- 
tern resembling the pattern established here (albeit in some cases one where all contrasts are neu- 
tralized). 
Prevoicing in Dutch Initial Plosives 
Production, Perception, and Word Recognition 


Petra M. van Alphen 
Max-Planck Institute for Psycholinguistics 


Prevoicing is the presence of vocal fold vibration during the closure of initial voiced 
plosives (negative VOT). The presence or absence of prevoicing is generally used to 
describe the voicing distinction in Dutch initial plosives. However, a phonetic study 
showed that prevoicing is frequently absent in Dutch. This article discusses the role 
of prevoicing in the production and perception of Dutch plosives. Furthermore, two 
cross-modal priming experiments are presented that examined the effect of prevoic- 
ing variation on word recognition. Both experiments showed no difference between 
primes with 12, 6 or 0 periods of prevoicing, even though a third experiment indi- 
cated that listeners could discriminate these words. These results are discussed in 
light of another priming experiment that did show an effect of the absence of 
prevoicing, but only when primes had a voiceless word competitor. Phonetic detail 
appears to influence lexical access only when it helps to distinguish between lexical 
candidates. 


1. Introduction 

This article focuses on the phonological voicing distinction in Dutch initial 
plosives, that is, the distinction between [+voice] and [−voice] in those plosives. 
Although most languages contrast these two phonemic classes (which I will refer 
to as voiced and voiceless plosives), the way in which this phonological distinc- 
tion is implemented phonetically varies across languages. Lisker & Abramson 
(1964) investigated eleven languages and measured the time between the release 
of a plosive and the onset of vocal fold vibration, which they referred to as Voice 
Onset Time (VOT). They established that, across languages, VOT is essentially 
tri-modal. The three categories based on VOT were: plosives with a negative 
VOT, produced with a voiced lead (i.e., with voicing during the closure); plosives 
with a slightly positive VOT, produced with almost no aspiration; and plosives 
with a clear positive VOT, produced with aspiration (see also Keating 1984). 
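
The three VOT categories can be illustrated with a minimal sketch. The function name and, in particular, the boundary of 30 ms between slightly and clearly positive VOT are assumptions chosen for illustration only; the text does not give boundary values, and actual category boundaries differ between languages and places of articulation.

```python
def vot_category(vot_ms, lag_boundary_ms=30.0):
    """Assign a VOT value (in ms) to one of the three modes.

    lag_boundary_ms is a hypothetical boundary between short-lag and
    long-lag plosives; it is not a value taken from the literature cited.
    """
    if vot_ms < 0:
        # Voicing starts before the release (voicing during the closure).
        return "voicing lead"
    if vot_ms < lag_boundary_ms:
        # Voicing starts shortly after the release: almost no aspiration.
        return "short lag"
    # Voicing starts well after the release: aspiration.
    return "long lag"
```

Under these assumptions, a token with a VOT of -80 ms falls in the voicing-lead mode, one with 10 ms in the short-lag mode, and one with 70 ms in the long-lag mode.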

Some languages, such as Thai, employ all three modes in a three-way voicing 
distinction. Most languages, however, have a two-way voicing distinction, which 
is implemented by two adjacent modes, one of which is associated with the 
voiced, and the other with the voiceless plosive. Keating, Linker & Huffman 
(1983) surveyed 51 languages and observed that almost all these languages use at 
least some kind of voiceless unaspirated plosive and that of the two categories 
contrasting the voiceless unaspirated plosive, fully voiced and voiceless aspirated 
plosives are about equally common. 

Germanic languages such as Danish, English and German contrast voiceless 
unaspirated and voiceless aspirated plosives in initial position (Keating 1984). 
Dutch, however, is unusual among Germanic languages in that it does not include 
this contrast. Instead, Dutch, along with other languages such as Arabic, Bul- 
garian, French, Japanese, Polish, Russian and Spanish, has a traditional voicing 
contrast (Keating 1984, Lisker & Abramson 1964). That is, the voiced plosives 
are produced with a voice lead, which I will refer to as prevoicing, and the voice- 
less plosives are produced with little or no aspiration. Figure 1 shows an example 
of a voiced plosive with prevoicing and an example of a voiceless plosive. There 
are three plosives in Dutch which belong to the voiceless category, namely [p], [t], 
and [k], while there are only two plosives which belong to the voiced category, 
namely [b] and [d]. The velar voiced plosive [g] only occurs in loanwords and is 
therefore not discussed here. 

In this article I will first describe the production of prevoicing and discuss the 
outcomes of a study by van Alphen & Smits (2004), which investigated the occur- 
rence of prevoicing in Dutch initial voiced plosives and the role of prevoicing in 
perception. The outcomes show an interesting paradox between production and 
perception: prevoicing appears to be frequently absent in Dutch initial voiced plo- 
sives, but the presence of prevoicing is nevertheless a very strong cue for the per- 
ception of voicing in these plosives. In order to fully understand the influence that 
prevoicing has on perception it is important not only to study phoneme percep- 
tion, but also to study word recognition. Words are after all the meaningful units 
which a listener has to recognize in order to retrieve the message of the speaker. 
Two priming experiments and a discrimination experiment will be presented 
which investigate the effects of two types of prevoicing variation on word recog- 
nition. The results of these experiments will be discussed together with the out- 
comes of a priming study by van Alphen & McQueen (2006) in which the influ- 
ence of lexical word competitors starting with a voiceless plosive was examined. 
These experiments lead to the conclusion that word recognition is sensitive to 
prevoicing variation, but only to the type of variation that is relevant for the dis- 
tinction between lexical candidates. 


2. Production of prevoicing 

Prevoicing refers to the presence of vocal fold vibration during the closure of 
the plosive. According to the myoelastic-aerodynamic theory of phonation (van 
den Berg 1958), the vocal folds will vibrate only when they are properly adducted 
and tensed, and when there exists a sufficient transglottal pressure gradient to re- 
sult in a positive airflow through the glottis from the lungs. When a vowel or con- 
tinuant consonant is produced it is not very difficult to obtain sufficient trans- 
glottal pressure: the vocal tract is open and therefore the supraglottal pressure will 
be lower than the subglottal pressure as long as there is sufficient air in the lungs. 


[Figure 1: two waveform panels with time on the horizontal axis. Upper panel: prevoicing (−VOT) preceding the release. Lower panel: release followed by +VOT.]

Figure 1: Waveforms of the initial voiced plosive and part of the vowel of the Dutch word /bo:t/ 
(upper panel) and of the initial voiceless plosive and part of the vowel of the 
Dutch word /po:t/ (lower panel). 


During the production of a plosive, however, all out-going airways are closed. 
The blocking of out-flowing air causes the supraglottal pressure to increase rapidly,
which results in a rapid decrease of the transglottal pressure. It is therefore 
relatively difficult to let the vocal folds vibrate during the closure. When the volume 
above the glottis increases, the build-up of supraglottal pressure is delayed to 
some extent, and therefore a sufficient transglottal pressure for voicing may be 
obtained for some period of time. Enlargement of the supraglottal cavity will thus 
help to initiate and maintain voicing. This enlargement can be obtained actively, 
by lowering the larynx, raising the soft palate, advancing the tongue root, or draw- 
ing the tongue dorsum and blade down (see Westbury 1982), or passively when 
the walls of the supraglottal cavity are lax which allows them to expand in re- 
sponse to the internal pressure (Rothenberg 1968). 
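
The aerodynamic reasoning above can be sketched as a toy simulation. This is a deliberately simplified model under strong assumptions that are not taken from the sources cited: the cavity walls are rigid (no active or passive expansion), glottal airflow is proportional to the transglottal pressure, and all parameter values (subglottal pressure, voicing threshold, glottal conductance) are rough illustrative numbers. The sketch shows only the qualitative point that a larger supraglottal volume delays the pressure build-up and thus allows voicing to continue for longer.

```python
def voicing_duration(cavity_volume_m3, p_sub=800.0, p_threshold=200.0,
                     conductance=2.5e-7, p_atm=101325.0, dt=1e-5):
    """Time (s) until transglottal pressure drops below the voicing threshold.

    Pressures are in Pa relative to atmospheric pressure; all default
    values are hypothetical. The cavity is closed and has rigid walls.
    """
    p_oral = 0.0   # supraglottal pressure at the start of the closure
    t = 0.0
    while p_sub - p_oral > p_threshold:
        # Airflow through the glottis (m^3/s), driven by the pressure drop.
        flow = conductance * (p_sub - p_oral)
        # Air entering a closed cavity raises its pressure; the rise is
        # larger when the cavity volume is smaller.
        p_oral += p_atm * flow * dt / cavity_volume_m3
        t += dt
    return t


small_cavity = voicing_duration(5e-5)   # 50 cm^3, hypothetical
large_cavity = voicing_duration(1e-4)   # 100 cm^3, hypothetical
```

In this model, doubling the cavity volume roughly doubles the time for which voicing can be sustained. The absolute durations are very short because no cavity expansion is modelled, which is in keeping with the point that expansion helps to initiate and maintain voicing.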

Children acquire the production of prevoicing relatively late (Kewley-Port & 
Preston 1974; see also the article by Kager et al., this volume). This also suggests 
that prevoicing production is relatively difficult. Nevertheless, studies on the pro- 
duction of prevoicing in languages such as Polish (Keating, Mikos & Ganong 
1981), Lebanese Arabic (Yeni-Komshian, Caramazza & Preston 1977) and European
French (Caramazza & Yeni-Komshian 1974) show that adult speakers rarely 
omit prevoicing when producing voiced plosives. Only one study, on Canadian 
French (Caramazza & Yeni-Komshian 1974), has found a substantial degree of 
overlap between the VOT distributions of voiced and voiceless plosives; no less 
than 58% of the voiced tokens in that sample (N=90) were produced without 
prevoicing. Caramazza and Yeni-Komshian argued that in Canadian French the 
VOT values are shifting as a result of the influence of Canadian English. 

Until recently, however, Dutch has not been systematically investigated in 
terms of occurrence of prevoicing in the production of voiced plosives. It is im- 
portant to know how prevoicing varies naturally in order to understand effects of 
prevoicing variation on speech perception. The way in which the speech rec- 
ognition system treats a particular acoustic property largely depends on the varia- 
tion in the occurrence of this property in natural speech. In one of my recent stud- 
ies (van Alphen & Smits 2004) the occurrence of prevoicing in Dutch was there- 
fore investigated. 

Van Alphen & Smits (2004) asked ten Dutch speakers to produce 32 real 
words and 32 nonsense words with initial voiced plosives (/b/ or /d/). These items 
were presented in random order in a list with fillers (including the same items starting 
with voiceless plosives), such that the speaker’s attention was not drawn to the 
voicing distinction. The results showed that 25% of the tokens with initial voiced 
plosives were produced without prevoicing. The proportion of prevoiced tokens 
was found to be influenced by the following factors: sex of the speaker (male or 
female), place of articulation of the plosive (labial or alveolar), and the phoneme 
following the plosive (vowel or consonant). All these factors might have an effect 
on the vocal tract volume or on the extent to which the vocal tract can be ex- 
panded. The smaller the volume of the vocal tract, the faster the supraglottal pres- 
sure increases and the more difficult it is to produce prevoicing. Male speakers are 
expected to have a larger vocal tract size than female speakers, which makes it 
easier for males to produce prevoicing. In line with this expectation, male speak- 
ers produced prevoicing more often than female speakers (86% versus 65%). The 
place of articulation of a plosive was expected to influence the extent to which the 
vocal tract can be expanded passively due to variable size of the oral cavity be- 
hind the point of constriction. For dental stops, the pharyngeal walls and part of 
the soft palate can yield to expansion of the oral cavity, while for labial stops 
these surfaces plus all of the tongue surface and parts of the cheek can participate 
in the expansion (Houde 1968, Rothenberg 1968). The oral cavity can thus be ex- 
panded more during the production of labial plosives than during the production 
of dental plosives. Van Alphen and Smits indeed found that labial plosives were 
more often produced with prevoicing than alveolars (79% versus 72%). Finally, 
the following phoneme was expected to affect the vocal tract size and the extent to 
which the different mechanisms (passively and actively) could expand the vocal 
tract size, and thus the proportion of prevoiced tokens. No effect of vowel height 
was found, but plosives followed by a vowel were more often prevoiced than plo- 
sives followed by a consonant (86% versus 65%). 
Although it seems that prevoicing was absent in the cases where the aerodynamics
made it harder to produce prevoicing, it cannot be the case that prevoicing 
is simply too difficult to produce in particular cases, since other studies on 
prevoicing production in other languages (e.g., Keating, Mikos & Ganong 1981, 
Yeni-Komshian, Caramazza & Preston 1977, and Caramazza & Yeni-Komshian 
1974) did not find such a large proportion of unprevoiced tokens. This suggests 
that Dutch speakers make less effort to produce prevoicing, resulting in a rela- 
tively large proportion of voiced plosives without prevoicing, especially in the 
cases in which it is difficult to produce prevoicing. We can only speculate about 
the reason for this. It may be the case that the way in which the voicing distinction 
in Dutch is implemented phonetically is changing as a result of the influence of 
English on the Dutch language. 


3. The role of prevoicing in the perception of the voicing distinction 

Now that we know that prevoicing is frequently absent in Dutch initial voiced 
plosives, we can ask what influence this has on perception. Are the voiced tokens 
produced without prevoicing still perceived as voiced? In other words, is the pro- 
duction of prevoicing essential for the plosives to be perceived as voiced, or are 
other acoustic cues present and strong enough to evoke a voiced percept? We 
know from the previous literature that VOT is not the only acoustic property 
which covaries with the voicing distinction in plosives (see for example Jessen 
1998 for German; Slis & Cohen 1969 for Dutch). Van Alphen & Smits (2004) 
therefore examined what other acoustic properties were present in the acoustic 
realizations of Dutch initial plosives which could serve as potential perceptual 
cues to the voicing distinction. The following six measures were obtained from a 
sample of 480 voiced tokens and 480 voiceless tokens: duration of prevoicing, 
duration of the burst, power of the burst, spectral centre of gravity of the burst, F0 
immediately after burst offset, and F0 movement into the vowel (see van Alphen 
& Smits 2004 for a detailed description of the measurements). Except for the F0 immediately
after burst offset, all measures showed a significant difference between 
phonologically voiced and voiceless plosives (that is, tokens which were intended 
to be voiced or voiceless by the speaker). In addition to the finding that voiced 
plosives had more prevoicing than voiceless plosives (which were never produced 
with prevoicing) the data showed that all three measures involving the burst (the 
duration, power and spectral centre of gravity) were lower for voiced than for 
voiceless plosives. Finally, the mean F0 difference (that is, the difference between 
the F0 in the middle of the following vowel and the F0 immediately after the burst) 
was positive for tokens starting with voiced plosives, consistent with a rising F0, 
while it was negative for tokens starting with a voiceless plosive, consistent with a 
falling F0. These differences indicate that the speech signal contains a variety of 
potential perceptual cues for the voicing distinction. 

Sixteen listeners were then asked to identify the 960 tokens as voiced or voice- 
less. Regression tree analysis of the responses indicated that, of all measured 
acoustic properties, the presence or absence of prevoicing was by far the strongest 
cue to the voicing distinction as perceived by listeners. All tokens produced with 
prevoicing were perceived as voiced. Tokens without prevoicing, however, were 
perceived either as voiced or voiceless. In those cases, the perceived voicing cate- 
gory depended on the value of the other acoustic cues in the signal. When those 
cues were in favour of the voiced category, the tokens were perceived as voiced, 
despite the absence of prevoicing. The acoustic cue which most strongly influ- 
enced listeners’ responses to tokens without prevoicing was different for the two 
places of articulation. The perception of voicing in labial plosives was influenced 
most strongly by the F0 difference from the burst of the plosive into the vowel: a 
higher F0 difference (that is, a clearly rising F0 pattern) yielded a higher proportion
of voiced responses. The perception of voicing in alveolar plosives appeared 
to be influenced most strongly by the spectral centre of gravity of the spectral 
noise of the burst: a higher spectral centre of gravity yielded a lower proportion of 
voiced responses. Nevertheless, these secondary cues were rather weak in com- 
parison to prevoicing. Of all tokens produced without prevoicing which were in- 
tended to be voiced, 37% were identified as voiceless. The absence of prevoicing 
clearly decreases the probability that a token is perceived as voiced. 
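
The cue hierarchy described above can be summarized as a hand-written decision rule. This is only a sketch of the qualitative pattern, not the regression tree fitted in the study: the function name, the argument names and both threshold values are hypothetical, and real responses were graded rather than categorical.

```python
def perceived_voicing(has_prevoicing, place, f0_difference_hz=0.0,
                      burst_cog_hz=0.0, f0_threshold_hz=0.0,
                      cog_threshold_hz=4000.0):
    """Return 'voiced' or 'voiceless' for an initial plosive token.

    Thresholds are illustrative assumptions, not estimates from the data.
    """
    if has_prevoicing:
        # Prevoicing is the dominant cue: such tokens are heard as voiced.
        return "voiced"
    if place == "labial":
        # Secondary cue for labials: a rising F0 into the vowel.
        return "voiced" if f0_difference_hz > f0_threshold_hz else "voiceless"
    if place == "alveolar":
        # Secondary cue for alveolars: a low spectral centre of gravity
        # of the burst.
        return "voiced" if burst_cog_hz < cog_threshold_hz else "voiceless"
    raise ValueError("place must be 'labial' or 'alveolar'")
```

The rule reproduces the asymmetry in the results: a token with prevoicing is classified as voiced whatever the other cues are, whereas a token without prevoicing can fall in either category.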

The results of the study by van Alphen & Smits (2004) indicate that the pres- 
ence or absence of prevoicing plays an important role in the phonetic realization 
and the perception of the phonological voicing distinction in Dutch initial plosives.
The role of prevoicing is, however, asymmetric: voiceless plosives are always
produced without prevoicing, while voiced plosives are not always produced 
with prevoicing. In line with this, tokens produced with prevoicing are always 
perceived as voiced, while tokens produced without prevoicing are not always 
perceived as voiceless. 

So far, I have argued that prevoicing has a strong influence on the identifi- 
cation of Dutch initial plosives as voiced or voiceless. Of course, speech percep- 
tion involves more than the perception of phonological features or the perception 
of single phonemes. The core process in speech perception is the recognition of 
words. Since words are the units which convey meaning, the recognition of words 
is an essential component of how the listener retrieves the message of the speaker. 
Thus, the next step one has to take in order to fully understand the effect of 
prevoicing variation on speech perception is to examine the influence of pre- 
voicing variation on the recognition of words. 


4. Effects of fine-grained acoustic details on word recognition 

Word recognition involves the mapping of the speech signal onto stored lexi- 
cal knowledge. As the utterance unfolds over time, multiple lexical candidates are 
activated as a result of the acoustic input. The activation of a lexical candidate at a 
particular moment in time reflects the goodness of fit with the available acoustic 
input at that moment. The candidate that eventually matches the acoustic input 
best will be recognized. It appears that the activated lexical candidates compete 
with each other for recognition; the most strongly activated candidate will sup- 
press the activation of the other lexical candidates and win the competition (see 
McQueen 2004 for an overview of the evidence for the existence of competition 
between lexical candidates). 
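
The idea of competition between activated candidates can be illustrated with a toy simulation. It is not an implementation of any of the word recognition models discussed in this article: the update rule, the goodness-of-fit values for the two candidates and all parameter values are assumptions made for illustration. Each candidate gains activation in proportion to its fit with the input and loses activation in proportion to the activation of its competitors.

```python
def compete(fits, inhibition=0.5, rate=0.1, steps=50):
    """Iterate a simple activation update with lateral inhibition.

    fits maps each candidate word to a hypothetical goodness of fit
    between 0 and 1. Activations are kept within the range [0, 1].
    """
    activation = {word: 0.0 for word in fits}
    for _ in range(steps):
        updated = {}
        for word, fit in fits.items():
            # Summed activation of all competing candidates.
            rivals = sum(a for w, a in activation.items() if w != word)
            change = rate * (fit - inhibition * rivals)
            updated[word] = min(1.0, max(0.0, activation[word] + change))
        activation = updated
    return activation


fits = {"beer": 0.8, "peer": 0.5}          # hypothetical fit values
with_competition = compete(fits)
without_competition = compete(fits, inhibition=0.0)
```

With inhibition switched on, the better-fitting candidate reaches full activation and holds the activation of the weaker candidate down; with inhibition switched off, both candidates eventually reach full activation, so that nothing distinguishes the winner.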
The speech signal is highly variable, however, and not all acoustic information 
is relevant for the recognition of words. Therefore, the assumption is that listeners 
perform a detailed phonetic analysis of the acoustic input prior to lexical access. 
At the prelexical level, the incoming speech signal is normalized and useful in- 
formation is extracted from the speech signal and translated into abstract represen- 
tations. Many different units have been proposed as prelexical representations, 
including syllables (Mehler 1981), semi-syllables (Massaro 1987), phonemes 
(Foss & Blank 1980; Nearey 2001), allophones (Luce, Goldinger, Auer & Vite- 
vitch 2000) and features (Stevens 2002). So far, research has not provided us with 
conclusive evidence singling out one of these units. 

These prelexical representations, whatever their exact nature, are assumed to 
activate word representations. The prelexical level thus acts as an intermediate 
level at which the speech signal is analyzed and filtered. On this account, it is im- 
portant to distinguish between acoustic detail that is normalized away at the prel- 
exical level and that which is passed on to the lexical level. It could be the case 
that at the prelexical level discrete decisions are made (for example in the FUL 
model by Lahiri & Reetz (1999) in which a phoneme is either [+voice] or 
[−voice]), and that most acoustic detail is thus normalized away. In contrast, it 
could also be the case that the prelexical level preserves part of the acoustic detail 
such that the output of the prelexical level is graded (for example in the recent 
version of Shortlist by Norris, McQueen & Cutler (2000), which involves graded 
activation of prelexical representations). If the latter assumption is correct, one 
particular token of the labial plosive may result in more activation of the prelexi- 
cal representation for [b] or for [+voice] than another token does. In other words, 
the question is how much acoustic detail is still present in the information that 
reaches the lexical level. 

Many studies have shown that lexical activation is in fact sensitive to fine- 
grained acoustic information (see McQueen, Dahan & Cutler 2003, for a detailed 
overview). These studies show that the degree of activation of lexical candidates 
is influenced by fine-grained differences in the speech signal and thus suggest that 
small acoustic details are preserved by the prelexical level and can reach the lexi- 
con. In other words, they challenge the view that discrete decisions (for example 
phonemic decisions) are made at the prelexical level. It seems that graded activa- 
tion of prelexical representations is passed on continuously to the lexical level. 
This is in line with spoken-word recognition models such as TRACE (McClelland 
& Elman 1986) and SHORTLIST (Norris 1994, Norris, McQueen & Cutler 2000), 
in which information flows continuously from a prelexical level of processing to 
the lexical level. 

Among the studies which report effects of fine-grained acoustic details on 
lexical activation, there are a number of studies which focus on variation in VOT. 
Andruski, Blumstein & Burton (1994) obtained variations in English VOT by removing
one third or two thirds of the original positive VOT of voiceless plosives 
which appeared word initially. They examined the influence of these VOT variations
on the activation of lexical candidates in a within-modality associative priming
experiment. In this experiment, listeners were asked to perform a lexical decision
task on spoken targets which were preceded by spoken primes. A target 
word, for example queen, was preceded by either a semantically unrelated prime, 
such as bell, or by a semantically related prime, such as king. All related primes 
started with a voiceless plosive and appeared in three different VOT conditions: 
with unaltered VOT, with two thirds of the original VOT, or with one third of the 
original VOT. Furthermore, half of the related primes were words which had a 
voiced word competitor, that is, changing the initial voiceless plosive into the 
matching voiced plosive resulted in a word, for example pear (bear is also an ex- 
isting word). The other half of the primes were words which did not have a lexical 
competitor, for example king (ging is not an existing word). The reaction time 
(RT) patterns of the lexical decisions showed that listeners were faster to make 
lexical decisions to targets when they were preceded by related primes than when 
they were preceded by unrelated primes. Lexical decisions to targets preceded by 
the primes with one third of the original VOT were significantly slower than lexi- 
cal decisions to the same targets preceded by primes with unaltered VOT. The 
presence of a voiced word competitor seemed not to influence these effects. Furthermore,
these effects of VOT manipulation only appeared when the delay between
the offset of the prime and the onset of the target was short (50 ms); they did 
not appear when the delay was longer (250 ms). 

Utman, Blumstein & Burton (2000) explored the influence of similar VOT dif- 
ferences on lexical activation using a uni-modal identity priming experiment. This 
time, both words and non-words starting with voiceless plosives were used as 
primes. Spoken targets were preceded by the same natural tokens of those targets, 
or by tokens in which the VOT was shortened. The findings for the word primes 
were consistent with the findings by Andruski et al. (1994): lexical decisions to 
spoken word targets, such as kiss, were slower when these targets were preceded 
by spoken primes, such as kiss, of which only one third of the original VOT was 
preserved, than when these targets were preceded by primes which were identical 
(with unaltered VOT). When targets and primes were non-words, however, no 
effect of the VOT reduction was found on the lexical decisions. 

McMurray, Tanenhaus & Aslin (2002) investigated the effect of VOT varia- 
tion on lexical access in English. The outcomes of their eye-tracking experiment 
showed that the mean proportion of fixations to two target pictures of a beach and 
a peach varied gradually as a function of the VOT of the initial plosives. 

These experiments show that differences in English positive VOT are not 
normalized away at the prelexical level, but that this type of acoustic detail is 
passed on to the lexical level where it can affect the degree of lexical activation. 
Can similar effects be observed for differences in the negative VOT of initial plo- 
sives in Dutch? This question was addressed in the priming experiments presented 
below. In order to understand the predictions which were made for Dutch, how- 
ever, it is important to first consider the differences between VOT in English and 
Dutch. 

Although in both English and Dutch VOT plays an important role in the pho- 
nological voicing distinction of word-initial plosives, the phonetic realization of 
voiced and voiceless plosives is rather different in the two languages. While in 
English the informative value of VOT lies in the positive VOT range, that is, in
the exact duration of aspiration, in Dutch it is the presence or absence of pre- 
voicing which seems to be important (van Alphen & Smits 2004). In English, the 
phoneme boundary between voiced and voiceless plosives in terms of VOT is not 
fixed, but varies on a continuous scale, for example as a function of speaking rate 
(Green & Miller 1985, Summerfield 1981). English listeners are therefore re- 
quired to make fine temporal distinctions along the VOT dimension in order to 
perceive the plosive as voiced or voiceless. In contrast, Dutch listeners do not 
need to establish the exact duration of the VOT to perceive the voicing distinction, 
since, as described above, the voicing distinction in Dutch is signalled by the 
presence or absence of prevoicing, rather than by the exact amount of prevoicing. 
Similar suggestions have been made by Keating, Mikos & Ganong (1981) about 
the comparison between the informational value of VOT variation for English 
versus Polish listeners. 

Given these differences between VOT in English and Dutch, the first pre- 
diction is that as long as Dutch initial plosives have prevoicing, differences in the 
exact amount of prevoicing will not affect lexical activation. After all, the exact 
duration of prevoicing will not help listeners to distinguish between two alterna- 
tive lexical candidates such as beer (bear) or peer (pear). Therefore, this type of 
uninformative acoustic detail should be normalized away at the prelexical level. 
As a result, shortening prevoicing duration should not affect lexical activation. 
The difference between the presence or absence of prevoicing, however, does 
carry information about the Dutch voicing distinction. Recall that van Alphen and 
Smits (2004) showed that the absence of prevoicing decreased the probability that 
that token was voiced. Therefore, the second prediction is that the deletion of 
prevoicing would affect lexical access. 


5. Experiment 1 

To test these predictions, three prevoicing values were chosen (0, 6 and 12 pe- 
riods of prevoicing) such that the smallest duration was zero and such that the 
physical difference between the subsequent prevoicing durations was the same. 
Importantly, all three degrees of prevoicing fell within the natural range of pre- 
voicing duration as established by van Alphen and Smits (2004). The expectation 
was to find an effect of the difference between the absence and presence of pre- 
voicing (0 versus 6 periods of prevoicing), but not of prevoicing shortening (12 
versus 6 periods of prevoicing). Furthermore, the experiments explored whether a 
possible effect of prevoicing differences would be influenced by the frequency of 
the prime words. High frequency words are usually recognized faster than low 
frequency words (e.g., Solomon & Postman 1952). Therefore, it may be the case 
that listeners are less sensitive to fine-grained acoustic variation in high frequency 
words than in low frequency words. 

Following Andruski et al. (1994), the associative priming task was chosen. But 
primes and targets were presented in different modalities (spoken primes were fol- 
lowed by visual targets), rather than within one modality. The reasons for this 
were twofold. First, Andruski et al. only observed effects when the delay between 
the offset of the prime and onset of the target was short (50 ms). Since the VOT
manipulation in Dutch appeared even earlier in the prime word than in English 
(prevoicing appears at the beginning of the plosive while aspiration appears after 
the burst of the plosive), it seemed preferable to present the targets immediately 
after the offset of the spoken prime. If both prime and target were presented
auditorily with zero delay, the target could mask the end of the prime. The
cross-modal version of the associative priming task avoids this problem. Second, the use
of a visual target ensured that what was tested was activation at the lexical level, 
rather than activation at the prelexical level as a result of possible phonological 
overlap between prime and target. Although most primes and targets did not over- 
lap phonologically (for example, bloem ‘flower’ — roos ‘rose’) some of the primes 
and targets did (for example, brood ‘bread’ — boter ‘butter’). 

The underlying idea in the use of the associative priming task is that the
processing of a stimulus (the prime) may facilitate the processing of a following
stimulus (the target) if the target is semantically related to the prime. To
measure the influence of the presentation of the prime on the processing of the 
target, participants are asked to perform a task such as lexical decision on the tar- 
gets. The RTs of these decisions are then compared to the RTs in a baseline condi- 
tion, in which the target is preceded by a semantically unrelated prime (see, for 
example, Marslen-Wilson & Zwitserlood 1989). 
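
The comparison described above can be sketched as follows. This is a minimal illustration only: the function name and the reaction times are invented for the example and are not data from the present study. The priming effect is taken as the mean baseline (unrelated) RT minus the mean RT in the related condition.

```python
from statistics import mean

def priming_effect(related_rts, unrelated_rts):
    """Facilitation in ms: mean baseline (unrelated) RT minus mean related RT.

    Positive values mean that the related prime sped up the lexical decision.
    """
    return mean(unrelated_rts) - mean(related_rts)

# Hypothetical RTs (ms) for one participant; illustrative values only.
related = [510, 495, 530, 505]
unrelated = [560, 540, 555, 545]
effect = priming_effect(related, unrelated)
```

In this invented example the related condition is faster than the baseline, so the effect is positive.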

If it is indeed the case that deletion of prevoicing affects lexical activation 
while differences in the amount of prevoicing do not, the following patterns 
should be observed: faster lexical decisions should be made to targets such as roos 
(‘rose’) when the preceding semantically related prime bloem (‘flower’) starts 
with prevoicing than when the same prime has no prevoicing; and there should be 
no difference between lexical decisions to targets preceded by related primes with 
12 periods of prevoicing and those to targets preceded by primes with 6 periods of 
prevoicing. If it is the case, however, that the prelexical level does not normalize 
away the difference in prevoicing duration (12 versus 6 periods of prevoicing) 
such that this type of variation does affect lexical activation, different priming ef- 
fects should be found for primes with 12 and 6 periods of prevoicing. The expec- 
tation would then be that primes such as bloem starting with 12 periods of 
prevoicing will result in stronger activation of the lexical representations of those 
words (e.g., the lexical representation of bloem) than the same primes starting 
with 6 periods of prevoicing, since plosives with 12 periods of prevoicing are
further away from the phoneme boundary. 


5.1 Method 

Participants 

Forty-eight students were paid to take part in the experiment. None of them re- 
ported any hearing loss. 


Materials 
Two types of words were selected as primes: 40 high frequency words (HF 
words), and 40 low frequency words (LF words). The mean frequency of the HF 
words was 97 per million words and the mean frequency of the LF words was 2
per million words (from the CELEX lexical database, Baayen, Piepenbrock & 
Gulikers 1995). Half of the HF words started with a /b/ and the other half started 
with a /d/. Of the LF words, 25 words started with a /b/ and 15 with a /d/. All 
words were mono- or disyllabic; the disyllabic words all had a strong-weak stress 
pattern. 

For each word a semantically related word was selected to serve as a visual 
target. This was done by asking 23 subjects to give their associations for each 
word. An associated word was regarded as a good target when the word was given 
in response by more than 25% of the subjects and when the difference between 
that associated word and the next most frequent associated word was greater than 
10%. The mean frequency of the targets associated with the HF prime words was 
133 per million words and the mean frequency of the targets associated with the 
LF prime words was 42 per million words. For each target an unrelated prime was 
also chosen which matched the related prime in length and started with the same 
phoneme (/b/ or /d/). In addition to the 80 word targets there were 40 non-word 
targets preceded by (unrelated) primes starting with a voiced plosive (half of them 
started with a /b/ and half of them with a /d/). Furthermore 200 other targets were 
paired with primes that started with a phoneme other than a /b/ or /d/: 120 non- 
word targets with unrelated primes; and 80 word targets, of which 20 were pre- 
ceded by a related prime and 60 by an unrelated prime. The design is summarized 
in Table 1. 
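
The target selection criterion described above can be sketched as follows. The sketch assumes that the 10% difference is measured in percentage points of the norming sample; the function name, parameter names and response lists are hypothetical and serve only as illustration.

```python
from collections import Counter

def select_target(responses, min_share=0.25, min_margin=0.10):
    """Return the top associate if it meets both criteria, else None.

    responses: one association per norming subject for a single prime
               (assumed to be non-empty).
    min_share: the top associate must be given by more than this proportion.
    min_margin: its lead over the runner-up must exceed this proportion.
    """
    counts = Counter(responses).most_common(2)
    n = len(responses)
    top_word, top_count = counts[0]
    second_count = counts[1][1] if len(counts) > 1 else 0
    top_share = top_count / n
    margin = (top_count - second_count) / n
    if top_share > min_share and margin > min_margin:
        return top_word
    return None

# Hypothetical norming data from 23 subjects (invented for illustration).
clear = ["roos"] * 12 + ["tuin"] * 5 + ["geur"] * 6
close = ["boter"] * 8 + ["bakker"] * 7 + ["kaas"] * 8
```

With these invented responses the first prime yields a usable target, whereas the second is rejected because its two most frequent associates are tied.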


Stimulus construction 

All primes were recorded several times on digital audio tape (at a sampling rate of 
48 kHz with 16-bit resolution) by a male native speaker of Dutch. The utterances 
were then digitized at a sample rate of 16 kHz. For the three prevoicing priming 
conditions and the unrelated condition, tokens were chosen which were produced 
clearly and with prevoicing. Subsequently, the original prevoicing of each related 
prevoiced item was replaced by 12, 6 or 0 periods of prevoicing (corresponding to 
129, 64 or 0 ms of prevoicing for /b/ and to 127, 62 or 0 ms of prevoicing for
/d/), in order to create the three different prevoicing conditions. 

The first full period of prevoicing plus the lead-in (of 5 ms) of a natural token 
of the word bus /bus/ (‘bus’) was chosen as the first period of prevoicing for the
two conditions with prevoicing for the items starting with a labial plosive. Simi- 
larly, the last prevoicing period of that same token of /bus/ always served as the 
last prevoicing period in these two conditions. The intervening prevoicing periods 
(10 or 4) were randomly chosen from the /bus/ token. The same procedure was 
applied to create the prevoicing 12 and prevoicing 6 conditions for the items start- 
ing with an alveolar plosive, but now the prevoicing periods were derived from a 
natural token of the word dus /dus/ (‘thus’). To control for any splicing effects, 
the prevoicing of each of the unrelated primes was also replaced by six periods of 
prevoicing. 
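
The splicing scheme described above can be sketched as a sequence of period indices. This is an illustration under stated assumptions: the number of periods in the source token (14 below) is hypothetical, and the intervening periods are assumed to be drawn with replacement from all periods of the source token, which the description above does not specify.

```python
import random

def build_prevoicing(source_periods, n_periods, rng):
    """Return a list of source-period indices for a spliced prevoicing.

    source_periods: number of prevoicing periods in the natural source token.
    n_periods: desired number of periods (0 means no prevoicing).
    The first and last source periods always keep their positions; the
    intervening slots are filled with randomly chosen source periods.
    """
    if n_periods == 0:
        return []
    if n_periods < 2:
        raise ValueError("need at least a first and a last period")
    first, last = 0, source_periods - 1
    middle = [rng.randrange(source_periods) for _ in range(n_periods - 2)]
    return [first] + middle + [last]

rng = random.Random(0)  # fixed seed so the sketch is reproducible
prev12 = build_prevoicing(14, 12, rng)  # 10 intervening periods
prev6 = build_prevoicing(14, 6, rng)    # 4 intervening periods
prev0 = build_prevoicing(14, 0, rng)    # prevoicing removed
```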

                                     HF words          LF words           Nonwords
                                                                          (Exp 2 only)

PRIME    Prevoicing 12               bloem ‘flower’    beits ‘stain’      breld
         Prevoicing 6                bloem ‘flower’    beits ‘stain’      breld
         Prevoicing 0                bloem ‘flower’    beits ‘stain’      breld
         Unrelated                   baan ‘job’        broche ‘brooch’    biem
TARGET   Experiment 1 (associative)  roos ‘rose’       verf ‘paint’       -
         Experiment 2 (identity)     bloem ‘flower’    beits ‘stain’      breld


Table 1: Design of Experiments 1 (associative priming) and 2 (identity priming). For each 
combination of priming condition and prime frequency (including nonword primes) examples of a 
prime and target are given. Real words have their English translation in parentheses. 


Procedure 

Primes were presented binaurally over headphones in a sound-damped booth. 
Immediately after the offset of the prime the visual target was presented in lower 
case on a computer screen. Subjects were instructed to listen to the word and then 
decide as quickly as possible whether the stimulus on the screen was a word or a 
non-word, by pressing one of two buttons. Four lists were constructed with prim- 
ing condition counterbalanced across lists. Each subject therefore saw each target 
only once, preceded by one of the four possible primes for that item. Furthermore, 
the lists contained all fillers such that half of the targets were words and the other 
half non-words. Of the total of 320 pairs in a given list, 80 pairs (25%) were re- 
lated. 
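
The counterbalancing of priming condition across lists can be sketched as a simple rotation. The rotation scheme itself is an assumption made for illustration, since the text states only that condition was counterbalanced across four lists; the item numbering is likewise hypothetical. The final lines recompute the proportion of related pairs from the counts given in the Materials section (20 related filler pairs, 320 pairs per list).

```python
CONDITIONS = ["prevoicing 12", "prevoicing 6", "prevoicing 0", "unrelated"]

def build_lists(n_items, conditions=CONDITIONS):
    """Rotate conditions over items so that each list pairs every item with
    exactly one condition and each item meets every condition across lists."""
    n_lists = len(conditions)
    return [
        {item: conditions[(item + lst) % n_lists] for item in range(n_items)}
        for lst in range(n_lists)
    ]

lists = build_lists(80)  # 80 test items, 4 lists

# Related test pairs in one list (the three prevoicing conditions),
# plus the 20 related filler pairs, out of 320 pairs in total.
related_test = sum(cond != "unrelated" for cond in lists[0].values())
related_share = (related_test + 20) / 320
```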

After the associative priming experiment, all test items that were used as re- 
lated primes were presented to the same listeners for identification of the initial 
phoneme. In addition to the 240 word tokens starting with a voiced plosive (/b/ or 
/d/), the identification task contained three repetitions of 80 distractor words
starting with a voiceless plosive (/p/ or /t/). The items were blocked by place of
articulation. Half of the subjects started with the labial plosives and half
started with the alveolar plosives.


5.2 Results and discussion 

The results of the phoneme identification task showed that, overall, 97% of the 
items starting with a voiced plosive were identified as voiced. One item appeared 
to be misrecorded and was therefore removed from all further analyses. Table 2 
shows the percentage of voiced responses in each of the three prevoicing condi- 
tions for HF words and LF words separately. 

                                HF words    LF words    Nonwords
Experiment 1   Prevoicing 12      98.0        98.3         -
               Prevoicing 6       98.6        98.5         -
               Prevoicing 0       93.2        93.4         -
Experiment 2   Prevoicing 12      98.7        98.6        99.3
               Prevoicing 6       98.9        99.4        99.4
               Prevoicing 0       93.8        93.5        82.9

Table 2: Percentage of voiced responses in the identification task of Experiments 1 and 2


The proportions of voiced responses were converted through an arcsine trans- 
formation (Studebaker 1985) and submitted to repeated measures subjects (F1) 
and items (F2) analyses of variance (ANOVAs) with the factors frequency and 
prevoicing. There was a main effect of prevoicing (F1(2,94) = 47.89, p < .001; 
F2(2,154) = 38.46, p < .001). No other effects were significant. Tukey honestly 
significant difference (HSD) tests showed that the proportion of voiced responses 
to tokens without prevoicing was significantly lower than those to items with 
prevoicing (either 12 or 6 periods of prevoicing). Nevertheless, these tokens with- 
out prevoicing were in general still perceived as voiced. Note that all items start- 
ing with voiced plosives were words, which could have biased listeners to respond 
with the voiced category. Inspection of the RTs of the identification responses 
suggested that some of the responses were initiated even before the end of the 
prevoicing (in particular when the plosive started with 12 periods of prevoicing). 
Apparently, in some cases the presence of prevoicing alone provided sufficient 
information that the plosive was voiced. Since not all responses were initiated af- 
ter the end of the prevoicing, it was not possible to correct for the length of the 
prevoicing. Therefore, there was no accurate way to analyze the RTs of the identi- 
fication data in this experiment. 
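
For illustration, the general idea of an arcsine transformation of proportions can be sketched as follows. The sketch uses the generic form 2·arcsin(√p), which is an assumption: it may differ from the exact variant that was applied in the analyses reported above. The two input proportions are taken from the HF column of Table 2 for Experiment 1.

```python
import math

def arcsine_transform(p):
    """Variance-stabilising transform of a proportion p in [0, 1] (radians)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a proportion between 0 and 1")
    return 2.0 * math.asin(math.sqrt(p))

# Proportions of voiced responses (Table 2, Experiment 1, HF words).
with_prevoicing = arcsine_transform(0.980)     # prevoicing 12
without_prevoicing = arcsine_transform(0.932)  # prevoicing 0
```

The transform stretches differences near the ceiling, which is where the identification scores in this study lie.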

In the associative priming study the effect of the different prevoicing durations 
was investigated by measuring lexical decision RTs to the visual targets. RTs 
were measured from target onset and therefore there was no need to correct for 
differences in the duration of the prime as a result of the prevoicing manipulation. 
The mean latencies of correct lexical decisions to word targets are shown in Fig- 
ure 2. 
Figure 2. Mean reaction times (RTs) to word targets preceded by high frequency (HF)
and low frequency (LF) primes in each of the four priming conditions in Experiment 1 
(associative priming). 


Subjects showed semantic facilitation, responding faster to targets preceded by 
semantically related primes than to targets preceded by unrelated primes.
Repeated-measures subjects (F1) and items (F2) ANOVAs with prime type (12
periods, 6 periods, no prevoicing, unrelated), frequency (HF and LF) and phoneme
(/b/ and /d/) as factors showed significant effects of prime type: F1(3,141) =
29.41, p < .001; F2(3,225) = 24.60, p < .001, and of frequency: F1(1,47) = 33.13,
p < .001; F2(1,75) = 6.33, p < .05. No other effects were significant. In addition,
t-tests on the following three planned comparisons were carried out: prevoicing 12
– prevoicing 6, prevoicing 6 – unrelated, and prevoicing 0 – prevoicing 6. The
outcomes of the two-tailed t-tests showed that the difference between the
prevoicing 6 condition and the unrelated condition was significant (t1(47) = -6.39,
p < .001; t2(78) = -5.82, p < .001), but that the other two differences were not
significant. This indicates that lexical decisions were significantly faster when the
target was preceded by a semantically related prime than when it was preceded by
a semantically unrelated prime, and that lexical decision latencies were not
affected by the degree of prevoicing. There were also no differences
among the error rates of the three prevoicing conditions.

The frequency effect indicated that RTs to targets preceded by an HF prime
were faster than RTs to targets preceded by an LF prime (517 ms versus 537 ms).
RTs were negatively correlated with target word frequency (r(79) = -0.249,
p < 0.05, two-tailed), but were not correlated with prime frequency, showing that
the frequency effect on RTs was caused by target frequency, not prime frequency. 

These results suggest that the three prevoicing variations tested here (12, 6 and 
0 periods) do not influence lexical access. It is possible, however, that the VOT
variation tested here does influence lexical access but that the associative priming 
task is not sensitive enough to measure an influence of such small acoustic differ- 
ences. Another possibility is that the effect is too short-lived to be observed at an 
inter-stimulus interval of 0 ms. Therefore an identity priming experiment was car- 
ried out in which the visual target was presented earlier relative to the prime. The 
cross-modal version of this task, rather than the intra-modal version, was chosen 
to ensure that differences in the speed of the lexical decisions would reflect a dif- 
ference in the degree of lexical activation rather than a difference in the degree of 
prelexical activation. This argument is in this case even more important since in 
the identity priming task the phonological overlap between prime and target is 
considerable, if not complete. This type of overlap can lead to non-lexical facilita- 
tion when prime and target are both presented auditorily (e.g., Slowiaczek, 
McQueen, Soltano & Lynch 2000). Furthermore, we know from the findings by 
Spinelli, McQueen & Cutler (2003) that the cross-modal identity priming task is 
sensitive to subtle variation in the initial phoneme. 


6. Experiment 2 


6.1 Method 

Participants 

Forty-eight subjects were paid to participate in this experiment. None had partic- 
ipated in the first experiment and none reported any hearing loss. 


Materials 

The same 40 HF words and 40 LF words of Experiment 1 were used in Experi- 
ment 2, but this time the visual target was the same word as the prime. For each 
target an unrelated prime was constructed that had the same number of syllables 
and the same initial phoneme as the related prime. There were also non-word 
primes. In addition to these items there were 200 filler pairs in which there was no 
relation between the prime and the target. They consisted of 40 non-word-non- 
word pairs, 80 non-word-word pairs and 80 word-non-word pairs. All materials 
came from the same recordings as in Experiment 1 and the VOT was manipulated 
in exactly the same way. The design is summarized in Table 1.


Procedure 

The procedure was identical to that of Experiment 1 except that the visual target 
was presented 200 ms after the onset of the burst. As in Experiment 1, subjects 
were asked to perform a phoneme identification task on the test items after they 
had completed the lexical decision task. To shorten the identification phase, lis- 
teners only had to identify the initial phoneme of each related prime with the same 
amount of prevoicing as the one they had heard in the identity priming
experiment. Again, half of the items in the identification experiment consisted of
distractors (this time words and non-words) starting with a /p/ or /t/.


6.2 Results and discussion 

The results of the identification task indicated that 96% of the items starting 
with a voiced plosive were identified as voiced. Table 2 gives the mean percent- 
age of voiced responses for each prevoicing condition, separately for HF words, 
LF words and non-words. The ANOVAs on the transformed proportions showed a
significant effect of frequency (F1(2,94) = 16.51, p < .001; F2(2,116) = 3.73,
p < .05), a significant effect of prevoicing (F1(2,94) = 95.66, p < .001;
F2(2,232) = 51.19, p < .001) and a significant interaction between frequency and
prevoicing (F1(4,188) = 18.95, p < .001; F2(4,232) = 6.00, p < .001). Tukey HSD
tests showed that the proportion of voiced responses was higher for words (HF or 
LF) than for non-words and that, as in Experiment 1, tokens without prevoicing 
were less often identified as voiced than tokens with prevoicing. Furthermore, the 
interaction between frequency and prevoicing was due to the fact that the dif- 
ference between tokens with and without prevoicing was larger in the non-words 
than in the HF or LF words. This confirms the suggestion which was made earlier, 
that the identification of voiced plosives without prevoicing was influenced by the 
lexical status of the item. 

Figure 3 shows the mean RTs of the lexical decisions for the four priming con- 
ditions, plotted separately for the three target conditions. Since correct lexical de- 
cisions to the HF and LF targets involved “yes” decisions, while correct lexical 
decisions to the non-word targets involved “no” decisions, words and non-words 
were analyzed separately. In the analysis of the word targets there were significant
effects of prime type: F1(3,141) = 70.76, p < .001; F2(3,225) = 58.60, p < .001,
and frequency: F1(1,47) = 324.26, p < .001; F2(1,75) = 102.41, p < .001. There
was also a significant interaction between prime type and frequency: F1(3,141) =
6.50, p < .001; F2(3,225) = 5.38, p = .001.

As in the previous experiment, t-tests on the following three planned com- 
parisons were carried out: prevoicing 12 — prevoicing 6, prevoicing 6 — unrelated, 
and prevoicing 0 — prevoicing 6. Only one pairwise comparison was significant: 
the difference between the prevoicing 6 condition and the unrelated condition: 
t1(47) = -9.80, p < .001; t2(78) = -8.68, p < .001. This indicates that lexical
decisions were significantly faster when targets were preceded by identical primes
than when targets were preceded by unrelated primes, and that there was no dif- 
ference in the degree of priming among the three prevoicing conditions. 

The significant effect of frequency indicated that lexical decisions were slower 
to LF targets than to HF targets (632 ms versus 529 ms). Note that in this experi- 
ment related primes and targets were identical and therefore LF primes were fol- 
lowed by LF targets and HF primes by HF targets. The significant interaction be- 
tween prime type and frequency was further inspected by performing planned t- 
tests for the three priming condition combinations for LF and HF primes sepa- 
rately. In both frequency groups, only the differences between the prevoicing 6 
priming conditions and the unrelated priming conditions were significant. 
Figure 3: Mean reaction times (RTs) to high frequency (HF) word targets,
low frequency (LF) word targets and nonword targets in each of the 
four priming conditions in Experiment 2 (identity priming) 


Figure 3 also shows the mean RTs of the lexical decisions to non-word targets. 
For the non-words there was a significant effect of prime type: F1(3,141) = 5.71, 
p = .001; F2(3,114) = 5.18, p < .01. Nevertheless, the three planned t-tests
showed that none of the pairwise comparisons were significant, indicating that the
priming effect was not as robust as in real words. This suggests that the
facilitation measured in an identity priming task is mainly due to activation at the
lexical level rather than activation at the prelexical level.

The results of the identity priming experiment are comparable to the results of 
the associative priming experiment. Both experiments show facilitation in lexical 
decisions to words when targets are preceded by related primes relative to when 
preceded by unrelated primes. But the VOT manipulations did not affect the de- 
gree of facilitation. It is possible, however, that the differences in prevoicing in 
these experiments were too small to be detectable for the listeners. Even though 
oscillograms of the stimuli clearly show that they differ in the degree of prevoic- 
ing, this does not necessarily mean that listeners can hear these differences. If they 
cannot hear these differences, it would not be very surprising that this type of 
variation does not influence lexical access. To test this, a third experiment was 
conducted in which listeners had to discriminate between the primes with differ- 
ent degrees of prevoicing. 


7. Experiment 3


7.1 Method 

Participants 

Eleven subjects participated. None reported any hearing loss and none had taken part
in the first two experiments. 


Materials 

The materials used in this experiment consisted of the 120 test items of Experi- 
ment 2 (40 HF words, 40 LF words and 40 non-words). For each item all three 
prevoicing versions were used (12 periods of prevoicing, 6 periods of prevoicing 
and no prevoicing). For each item 6 pairs were constructed in such a way that all 
combinations of prevoicing (prev) appeared: prev 12 – prev 12, prev 6 – prev 6,
prev 0 – prev 0 (the “same” pairs) and prev 12 – prev 0, prev 6 – prev 0 and
prev 12 – prev 6 (the “different” pairs). The order of items within the “different”
pairs was balanced. In total there were 720 pairs. 
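
The pair construction described above can be checked with a short sketch. The names are hypothetical, and the balancing of presentation order within the “different” pairs is not modelled here.

```python
from itertools import combinations

LEVELS = [12, 6, 0]  # periods of prevoicing

def pairs_for_item(levels=LEVELS):
    """Three 'same' pairs plus the three unordered 'different' pairs."""
    same = [(lv, lv) for lv in levels]
    different = list(combinations(levels, 2))
    return same + different

per_item = pairs_for_item()
total_pairs = 120 * len(per_item)  # 120 test items
```

Six pairs for each of the 120 items give the 720 pairs reported above.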


Procedure 

All pairs were presented auditorily in a sound-damped booth in random order. The 
ISI within a trial was 300 ms and the interval between the offset of a trial and the 
onset of the next trial was 2000 ms. Subjects were asked to listen carefully to the 
two items while concentrating on the beginning of the initial sounds and to decide 
whether the two items were the same or different, by pressing the appropriate but- 
ton. Before the real experiment started they heard 12 pairs that were different. 
They had been told beforehand that there was a difference in the initial phoneme 
between the two items of each pair. After that there was a training session of 24 
pairs prior to the main experiment session. 


7.2 Results and discussion 

Following Macmillan & Creelman (1991), d' was calculated for each subject 
for each prevoicing combination. This is a measure for the listeners’ sensitivity to 
discriminate two stimuli from each other by taking into account both the propor- 
tion of hits and the proportion of false alarms. When d's differ from zero this indi- 
cates that listeners performed above chance. Moderate performance implies that d' 
is near unity (Macmillan & Creelman 1991). 
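
As an illustration of how hits and false alarms are combined, the basic formula d' = z(H) - z(F) can be sketched as follows. This is a simplified sketch: it uses the plain yes/no formula, whereas same-different designs can also be analysed with other decision models, and the text does not state which model was used here. A hit is taken to be a “different” response to a “different” pair and a false alarm a “different” response to a “same” pair; the rates below are invented.

```python
from statistics import NormalDist

def d_prime(hit_rate, false_alarm_rate):
    """d' = z(H) - z(F), using the inverse of the standard normal CDF.

    Both rates must lie strictly between 0 and 1; rates of exactly 0 or 1
    would need a correction before they can be transformed.
    """
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(false_alarm_rate)

# Hypothetical rates for one listener (illustrative values only).
chance = d_prime(0.50, 0.50)    # no sensitivity
moderate = d_prime(0.69, 0.31)  # close to unity
```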

A one-way ANOVA on the d's indicated that there was a main effect of
prevoicing combination: F(2,20) = 24.29, p < .001. A Tukey HSD test showed that
the combination prev 12 – prev 0 differed significantly from the combinations
prev 6 – prev 0 and prev 12 – prev 6, but that the difference between the
combinations prev 6 – prev 0 and prev 12 – prev 6 was not significant. Thus it was
easier to discriminate between two members which differed by 12 periods of
prevoicing than between two members which differed by only 6 periods.
Nevertheless, all d's differed significantly from zero
(prev 12 – prev 0: t(10) = 8.20, p < .001; prev 6 – prev 0: t(10) = 7.11, p < .001;
prev 12 – prev 6: t(10) = 4.34, p = .001). This indicates that listeners performed
above chance and could thus discriminate among all three prevoicing durations.
The d's of the pairs involving 0 periods of prevoicing also differed significantly
from unity (prev 12 – prev 0: t(10) = 5.10, p < .001; prev 6 – prev 0: t(10) = 2.92,
p < .05). This suggests that the difference between 12 and 6 periods of prevoicing
was the most difficult to detect.


[Figure 4 plot: d' (y-axis) for same-different discrimination, by prevoicing pair (x-axis): Prevoicing 12 vs. Prevoicing 0, Prevoicing 6 vs. Prevoicing 0, Prevoicing 12 vs. Prevoicing 6.]


Figure 4: Mean d's for the three combinations of prevoicing 
in Experiment 3 (discrimination). 


8. The influence of voiceless lexical competitors on the effect of prevoicing differences

The two priming experiments reported here investigated the influence of pre- 
voicing variation on lexical access and its interaction with the frequency of the 
lexical candidates. Both experiments showed a clear priming effect for words. In 
the associative priming task, listeners were faster to decide that a visual target
such as roos (‘rose’) was a word when it was preceded by a semantically related
prime such as bloem (‘flower’) than when it was preceded by a semantically unre- 
lated prime such as baan (‘job’). In the identity priming task, lexical decisions to 
word targets such as bloem were faster when the target was preceded by a prime 
which was identical to the target (in this case the auditory version of bloem) than 
when the target was preceded by unrelated primes, such as baan (‘job’). For the 
non-words no substantial priming effect was found. Furthermore, both experi- 
ments showed no difference among primes with 12, 6 or 0 periods of prevoicing. 

The absence of an effect of prevoicing duration (12 periods of prevoicing ver- 
sus 6 periods of prevoicing) was expected. Recall that van Alphen & Smits (2004) 
found that in Dutch the amount of prevoicing appears to be uninformative to the 
listener. All tokens which were produced with prevoicing were unambiguously 
identified as voiced, regardless of the exact amount of prevoicing. The primary
cue to the perception of [+voice] appeared to be the presence of prevoicing, rather 
than the duration of prevoicing. Therefore, we predicted that variation in the 
amount of prevoicing (12 versus 6 periods) would not affect lexical access. 

The absence of an effect of the deletion of prevoicing, however, seems puz- 
zling. Van Alphen and Smits showed that when prevoicing was absent, the prob- 
ability that the token was voiced decreased. Nevertheless, the majority of the 
voiced plosives without prevoicing were still perceived as voiced. This is in line 
with the present identification results: although all tokens in these experiments 
were in general perceived as being voiced, the percentage of voiced responses was 
lower for tokens with no prevoicing than for tokens with 12 or 6 periods of 
prevoicing. As mentioned earlier, several studies have shown that lexical activa- 
tion is sensitive to fine-grained acoustic information, suggesting that information 
flows continuously from a prelexical level of processing to the lexical level. 
Based on these findings one would expect that the deletion of prevoicing would 
affect the degree of activation of lexical candidates starting with voiced plosives. 

It is also the case, however, that prevoicing deletion does not result in unnatu- 
ral or rare tokens. Prevoicing is frequently absent in naturally produced tokens of 
voiced plosives. As a result, Dutch listeners have often encountered words starting 
with plosives without prevoicing that should have started with voiced plosives 
(e.g., hearing bloem without prevoicing). Therefore, Dutch listeners might have 
learned that a plosive without prevoicing could still be voiced. This can explain 
why no effects of prevoicing deletion were found in the current priming studies. 
However, since in natural speech most plosives without prevoicing are actually 
voiceless, listeners should not ignore the presence or absence of prevoicing. To- 
kens without prevoicing should thus activate both voiced and voiceless prelexical 
representations, which in turn should activate lexical candidates starting with 
voiced plosives and lexical candidates starting with voiceless plosives. Note that
none of the words in the present study had a voiceless word competitor. That is, 
for all words, changing the voicing category of the initial voiced plosive from 
voiced to voiceless resulted in non-words (e.g., ploem is not a Dutch word). 
Therefore, there were no voiceless lexical candidates which could seriously com- 
pete with the voiced word candidates. If it is indeed the case that items starting 
with voiced plosives without prevoicing activate both voiced and voiceless lexical 
candidates, one would expect to find effects of prevoicing deletion when primes 
are used which have a voiceless word candidate that could be activated. 

Van Alphen & McQueen (2006) therefore investigated the influence of the 
competitor environment on the effect of prevoicing variation. They ran two cross- 
modal identity priming experiments similar to Experiment 2 of the present study, 
but, instead of the frequency conditions, they constructed four different lexical 
status conditions. The first condition, referred to as the Blue condition, contained 
word primes starting with voiced plosives which had no voiceless word competi- 
tor, for example blauw (blauw means ‘blue’ and plauw is not a word of Dutch). 
This condition is equivalent to that tested in the present Experiment 2. The second 
condition, referred to as the Bear condition, contained word primes starting with 
voiced plosives with voiceless word competitors, for example beer (beer means
‘bear’ and peer means ‘pear’). The third condition, the Blem condition, like the 
non-word condition in Experiment 2, contained non-word primes starting with 
voiced plosives without voiceless word competitors, for example blem (neither 
blem nor plem is a word of Dutch). The final condition, the Brince condition, contained
non-word primes starting with voiced plosives which had a voiceless word
competitor, for example brins (brins is not a word of Dutch and prins means 
‘prince’). 

In both experiments five priming conditions were used. In addition to the three 
prevoicing conditions (prevoicing 12, prevoicing 6, prevoicing 0) and the unre- 
lated priming condition which were also used in the present study, a voiceless 
priming condition was constructed which contained natural recordings of the 
voiceless word and non-word counterparts of the voiced primes. This voiceless 
priming condition (e.g., the prime peer) served together with the voiced priming 
condition (e.g., the prime beer with 6 periods of prevoicing) as reference condi- 
tions for the condition with voiced primes without prevoicing. The combination of 
five priming conditions and four lexical status conditions resulted in 20 different 
conditions in each experiment. 
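
The crossing of priming and lexical status conditions described above can be made concrete with a short sketch. The condition labels follow the text; the variable names are illustrative assumptions and are not part of the original study materials.

```python
from itertools import product

# Labels taken from the description above; the identifiers are hypothetical.
PRIMING = ["prevoicing 12", "prevoicing 6", "prevoicing 0", "voiceless", "unrelated"]
LEXICAL_STATUS = ["Blue", "Bear", "Blem", "Brince"]

# Fully crossed design: every lexical status condition under every priming condition.
design = list(product(LEXICAL_STATUS, PRIMING))

print(len(design))  # 20
```

Each pair in `design` corresponds to one cell of the experiment, for example the Bear condition with a voiceless prime.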

The difference between the two priming experiments was the nature of the tar- 
get. In the first experiment the items starting with voiced plosives served as tar- 
gets, while in the second experiment the voiceless counterparts of the voiced 
items served as targets. For example, in the Bear condition, in the first experiment 
the target was beer and in the second experiment it was peer. In this way the de- 
gree of activation of both the voiced and voiceless lexical candidates (e.g., beer 
and peer) could be measured. 

The results showed clear priming effects when prime and target were identical. 
Furthermore, the RT patterns showed that when prime and target differed only in 
the phonological voicing of the initial phoneme (for example, the prime was peer 
and the target was beer) no facilitatory effect was found. As in the present ex- 
periments, there was never an RT difference between the effect of a prime with 12 
periods of prevoicing and that of a prime with 6 periods of prevoicing. As in the 
present experiments, there was also no effect of prevoicing deletion when the 
voiced prime had no voiceless word competitor. Crucially, however, when the 
voiced prime did have a voiceless word competitor, effects of prevoicing deletion 
were found. For example, when word targets such as peer were preceded by 
voiced primes without prevoicing (beer without prevoicing), lexical decisions to 
targets were faster in comparison to the same targets preceded by voiced primes 
with prevoicing (e.g., beer with 6 periods of prevoicing), but slower in compari- 
son to the same targets preceded by voiceless primes (e.g., peer). Similarly, when 
a non-word target such as brins was preceded by voiced primes without prevoic- 
ing (e.g., brins without prevoicing), lexical decisions (in this case “no” decisions) 
were slower in comparison to decisions to the same targets preceded by voiced 
primes with prevoicing (e.g., brins with 6 periods of prevoicing) and faster in 
comparison to these targets preceded by voiceless primes (e.g., prins). For a
detailed description of the patterns found in all conditions, see van Alphen &
McQueen (2006).

These results suggest that items starting with voiced plosives without pre- 
voicing significantly activate both lexical candidates starting with voiced plosives 
and lexical candidates starting with voiceless plosives. Items with prevoicing
significantly activate only lexical candidates starting with voiced plosives. In none
of the lexical status conditions was an effect of the difference between 12 and 6
periods of prevoicing found. It thus appears that the recognition system is more
sensitive to variation in the speech signal that is more important for lexical dis- 
tinctions. The presence or absence of prevoicing appears to be the primary cue for 
the voicing distinction in Dutch while variation in the exact duration of prevoicing 
is not very important. As a result, a difference between the presence and absence 
of prevoicing (6 versus 0 periods of prevoicing) influences lexical access more 
strongly than a difference in the amount of prevoicing (12 versus 6 periods of 
prevoicing). Effects of the absence of prevoicing, which is relevant for word rec- 
ognition, were only observed when there was a voiceless lexical candidate. When 
there was no such candidate, the voiced word candidate was the only plausible 
lexical hypothesis and could easily win the competition with all other candidates, 
even when there was no prevoicing. 


9. Conclusions 

In this article I have focused on the phonological voicing distinction in Dutch 
initial plosives. The phonological distinction between [b] and [d] on the one hand 
and [p] and [t] on the other hand, is straightforward: the former are voiced and the 
latter are voiceless. The phonetic realization of this distinction in Dutch, however, 
is less straightforward. Voiced plosives are said to be produced with voicing dur- 
ing the closure (i.e., with a negative VOT) while voiceless plosives are produced 
without voicing during the closure but with little or no aspiration (i.e., with a posi- 
tive VOT). The study on the occurrence of prevoicing in Dutch revealed that a 
considerable proportion of voiced plosives (25%) were produced without prevoic- 
ing. When the aerodynamic circumstances made it more difficult to produce vocal 
vibration, prevoicing was often absent. Nevertheless, these tokens could still be 
perceived as voiced, provided that the remaining acoustic cues were in favour of a 
voiced plosive. This last condition, however, was not always met. As a result, 
some of the voiced tokens without prevoicing were perceived as voiceless. In con- 
trast, all tokens produced with prevoicing were perceived as voiced. 

The presence of prevoicing is thus a very strong cue to the perception of plo- 
sives as voiced, but is nevertheless not always realized by speakers when pro- 
ducing voiced plosives. This is an intriguing paradox. How does the speech per- 
ception system treat the absence of prevoicing? On the one hand, listeners have 
learned that the absence of prevoicing strongly signals that the token is voiceless; 
on the other hand, listeners have often encountered words with plosives without 
prevoicing which appeared to be voiced. The identification results showed that the 
absence of prevoicing influenced the proportion of voiced responses. Although 
the majority of the tokens without prevoicing were perceived as voiced, some
listeners perceived some of these tokens as voiceless. What consequences does this
have for the recognition of words starting with voiced plosives without prevoic- 
ing? Is a word like bloem still recognized when it is produced without prevoicing, 
or is it sometimes recognized as ploem? Even if it is correctly recognized, the ab- 
sence of prevoicing could still have affected the recognition process. It is possible 
that words starting with plosives without prevoicing are more difficult to recog- 
nize than words with prevoicing. In order to fully understand the role of prevoic- 
ing in perception it is therefore important to also include word recognition. 

Two priming experiments were presented which investigated the effects of 
prevoicing deletion on lexical activation. The difference between the presence and 
absence of prevoicing was contrasted with a difference in the amount of prevoicing
present. Both prevoicing differences were of the same size and fell within the
natural range of prevoicing variation. The prediction was that these two types of
prevoicing variation differ, however, in their informational value. While the
presence or absence of prevoicing is relevant to the voicing distinction, the exact
duration of the prevoicing is not. The hypothesis was that only acoustic detail which
would help to distinguish between two phoneme classes, and thus between lexical 
candidates, would affect lexical activation, while irrelevant acoustic detail would 
be normalized away at the prelexical level. Therefore, the difference involving the 
presence or absence of prevoicing was expected to affect lexical activation, while 
the difference between 12 and 6 periods of prevoicing was not. 

The results suggested that neither of the two differences in prevoicing had an 
effect on the degree of lexical activation. I argued that the absence of an effect of 
the deletion of prevoicing could be explained by the fact that prevoicing is fre- 
quently absent in Dutch. Dutch listeners have often encountered words starting 
with plosives without prevoicing that should have started with voiced plosives 
(e.g., hearing bloem without prevoicing). Therefore, they might have learned that 
a plosive without prevoicing could still be voiced. When the words starting with 
voiced plosives had no matching voiceless word competitor, as was true for all 
primes in the two present priming experiments, the lexical candidate starting with 
the voiced plosive was considered to be the only plausible lexical hypothesis 
when listeners heard these words without prevoicing. This argument was 
strengthened by the results of two different priming experiments in which the 
competitor environment of the words with initial voiced plosives was manipulated 
(van Alphen & McQueen 2006). When the primes had a voiceless word competi- 
tor, an effect of prevoicing deletion was observed. But there was never an effect 
of the difference between 12 and 6 periods of prevoicing. 

The results of these priming experiments show that word recognition is more 
sensitive to phonetic detail that is more important for phonemic distinctions (and 
thus for lexical distinctions). They also show the robustness of the word recognition
system and the influence of the lexical competitor environment. When there is no 
voiceless word competitor, the recognition system can easily recover from effects 
of prevoicing deletion, probably due to the fact that this type of variation naturally 
occurs in Dutch. Only when there is a voiceless word competitor can prevoicing 
deletion substantially affect the recognition process. 


By combining the results of production and perception experiments, including 
experiments involving word recognition, I aimed to give more insight into the 
voicing distinction in Dutch initial plosives and the role of prevoicing. In particu- 
lar, I intended to show that the distinction between voiced and voiceless plosives 
is less straightforward than one might expect on the basis of the phonological de- 
scription of these sounds. It appears that the recognition system does not make a 
simple binary distinction between voiced and voiceless plosives (such as 
[+voiced] and [—voiced]), but that there are different degrees of voicing. The pho- 
netic realization of a plosive determines the probability that the plosive is voiced. 
Of all acoustic properties, prevoicing appears to be one of the most important cues 
affecting this probability. Nevertheless, voiced plosives are frequently produced 
without prevoicing. This not only affects the role that prevoicing plays in the per- 
ception of plosives as voiced or voiceless, but also the way in which the word rec- 
ognition system treats variation in prevoicing. 
Dutch Regressive Voicing Assimilation as a
‘Low Level Phonetic Process’
Acoustic Evidence 


Wouter Jansen 
UCE Birmingham 


This paper investigates the behaviour of a number of acoustic cues to phonological 
[voice] in Dutch word-final /ps/ sequences. It reports measurements on elicited pro- 
ductions of such clusters before phonologically voiced and voiceless plosives, the 
labial nasal /m/, the glottal consonant /h/ and lexical vowels. The results of this ex- 
periment provide evidence that regressive voice assimilation (RVA) occurs in /ps/ 
clusters before [+voice] plosives, contradicting claims in some of the literature that 
obstruent + fricative sequences are exempt from RVA. The behaviour of the indi- 
vidual cues to [voice] observed here also suggests that Dutch regressive voicing 
assimilation is a ‘low level’ coarticulatory process rather than a rule manipulating 
lexical phonological structure. 


1. Introduction 

Phonological accounts of Dutch regressive voicing assimilation (RVA) tend to 
regard the process as asymmetric in more than one respect. First, the obstruents 
targeted by this assimilation rule are subject to an independent rule of final laryn- 
geal neutralization or ‘final devoicing’, which (on most accounts) means that 
voicing assimilation does not occur, or is rendered vacuous, before phonologically 
[—voice] obstruents. Laryngeal neutralization applies across the board in word- 
final environments and consequently assimilation does not have to be invoked to 
account for the voiceless realization of the final obstruent of underlying /zand/ in 
a compound such as [zantpla:t] ‘sandbar’. Instead, this realization can be
attributed to the same constraint or rule that makes /zand/ + /lo:pər/ ‘hourglass’ surface
as [zantlo:pər]. An SPE-style linear version of this rule appears in (1a) below (cf.
(1) in Zonneveld, this volume). 

In other words, only [+voice] obstruents are commonly regarded as being ca- 
pable of triggering (observable) RVA in the phonology of Dutch. This means that 
the relevant rule or constraint can be formulated with reference to [+voice] only, 
as in (1b), although a symmetric version produces exactly the same result (cf. rule 
(2) in Zonneveld, this volume). 


(1) a. Final devoicing
       [-son] → [-voice] / __ #
    b. Voicing assimilation
       [-son] → [+voice] / __ [-son, -cont, +voice]
    c. Fricative devoicing
       [-son, +cont] → [-voice] / [-son] __


Second, virtually all descriptions of the language agree that among Dutch [+voice] 
obstruents only the plosives trigger regressive voicing assimilation: the [+voice] 
fricatives /v, z/, and in southern varieties /ɣ/, are devoiced when preceded by an
obstruent and fail to trigger voicing assimilation in preceding obstruents. Thus, 
whereas the word-final stop of /zand/ may be voiced through RVA in for instance 
[zandbak] ‘sandbox’, the medial cluster in /zand zak/ ‘sandbag’, surfaces as en- 
tirely voiceless: [zantsak] (or perhaps [zantzak]). (1c) represents a linear version 
of the fricative devoicing rule (cf. (3) in Zonneveld, this volume). 

Under this analysis, then, Dutch regressive voicing assimilation is (trivially) 
asymmetric with respect to the feature [voice] (or some formal equivalent) in that 
it is only triggered by one of the 2 possible values, and it is asymmetric with re- 
spect to manner of articulation because it can only be triggered by plosives. 

In light of the second asymmetry, it is perhaps surprising that relatively few 
accounts of Dutch RVA consider the behaviour of clusters composed of word- 
final stop + fricative combinations followed by [+voice] stops, as in, e.g., /rɛiks/ +
/da:ldər/ ‘Dfl 2.50 coin (proper name)’ or /fi:ts/ + /bel/ ‘bicycle bell’. In such
sequences, regressive voicing assimilation and fricative devoicing clash in the sense
that the former requires at least the medial fricative to be voiced, whereas the lat- 
ter demands that the medial fricative be voiceless. Thus, the behaviour of obstru- 
ent + fricative + voiced plosive clusters would seem to provide some useful clues 
to the proper formulation, and/or ordering (or ranking) of the formal rules in (1) 
above. 

Brink (1975) is one of the few authors to discuss this issue within a generative 
phonological framework. Citing earlier work on Dutch, he states categorically that 
no RVA takes place in obstruent + fricative + [+voice] plosive sequences. Brink 
claims that words such as /fi:ts/ + /bel/ can be pronounced with a phonetically 
prevoiced [+voice] plosive, as in [fi:tsbel], or with a (partially) devoiced [+voice]
plosive as in [fi:tsb̥el], but are never realized with any voicing in the
preceding obstruents: *[fi:tzbel], *[fi:dzbel]. This view is echoed by Cammenga
& van Reenen (1980), who criticize the account of Dutch obstruent voicing in 
Booij (1981) for predicting RVA in this type of cluster. On the basis of these 
claims, rule orderings or constraint rankings would have to give priority to frica- 
tive devoicing (1c) over voicing assimilation (1b). 
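
The effect of rule ordering on such clusters can be illustrated with a toy derivation. The sketch below is an assumption-laden simplification and not an implementation from the literature: segments are single ASCII characters, the feature classes are a small hand-made inventory, and each rule applies once, simultaneously, across the whole string. All function and variable names are hypothetical.

```python
# Toy inventory (an assumption of this sketch, not the paper's feature system).
VOICED_TO_VOICELESS = {"b": "p", "d": "t", "v": "f", "z": "s"}
VOICELESS_TO_VOICED = {v: k for k, v in VOICED_TO_VOICELESS.items()}
OBSTRUENTS = set(VOICED_TO_VOICELESS) | set(VOICELESS_TO_VOICED) | {"k"}
VOICED_FRICATIVES = {"v", "z"}
VOICED_PLOSIVES = {"b", "d"}


def final_devoicing(word):
    """(1a): devoice a word-final obstruent."""
    if word and word[-1] in VOICED_TO_VOICELESS:
        return word[:-1] + VOICED_TO_VOICELESS[word[-1]]
    return word


def voicing_assimilation(segs):
    """(1b): voice an obstruent immediately before a voiced plosive."""
    out = list(segs)
    for i in range(len(segs) - 1):
        if segs[i] in VOICELESS_TO_VOICED and segs[i + 1] in VOICED_PLOSIVES:
            out[i] = VOICELESS_TO_VOICED[segs[i]]
    return "".join(out)


def fricative_devoicing(segs):
    """(1c): devoice a fricative immediately after an obstruent."""
    out = list(segs)
    for i in range(1, len(segs)):
        if segs[i] in VOICED_FRICATIVES and segs[i - 1] in OBSTRUENTS:
            out[i] = VOICED_TO_VOICELESS[segs[i]]
    return "".join(out)


def derive(word1, word2, order):
    """Apply (1a) to each word, concatenate, then apply the rules in `order`."""
    segs = final_devoicing(word1) + final_devoicing(word2)
    for rule in order:
        segs = rule(segs)
    return segs


# Fricative devoicing has the last word: no voicing survives in the cluster.
print(derive("fits", "bel", [voicing_assimilation, fricative_devoicing]))  # fitsbel
# Voicing assimilation has the last word: the fricative surfaces as voiced.
print(derive("fits", "bel", [fricative_devoicing, voicing_assimilation]))  # fitzbel
```

Under the first ordering the toy grammar reproduces the outcome that Brink describes, while the second ordering predicts a voiced fricative before the voiced plosive. Forms such as "zand" + "bak" and "zand" + "zak" come out as "zandbak" and "zantsak" under either ordering in this sketch.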

This paper discusses a production experiment that was designed to test the 
above assertions by Brink (1975) and Cammenga & van Reenen (1980). Four na- 
tive speakers of Dutch were asked to produce word-final /p/ + /s/ sequences in a 
range of contexts. Their productions of these clusters were then examined with 
respect to a number of known acoustic correlates of [voice], including phonetic
voicing and segmental durations. The results of this experiment contradict the 
claim that no assimilation occurs in stop + fricative + stop sequences. In particu- 
lar, clear differences in phonetic voicing between /ps/ clusters preceding [—voice] 
and [+voice] plosives indicate that, at least in production, these clusters are sub- 
ject to regressive assimilation of voice. 

The implications of the experimental data reported below are potentially more 
far-reaching than this, however, in that they cast doubt on any model of Dutch 
RVA constructed along the lines of (1) above. Clusters of /p/ + /s/ followed by the 
[—voice] plosives /p, t/, the bilabial nasal /m/, and [+voice] /b, d/ appear to exhibit 
a three-way voicing contrast which suggests that Dutch regressive voicing assimi- 
lation is not asymmetric with regard to [voice] as often assumed, but behaves 
symmetrically, with nasals (and other sonorant consonants) acting as an ‘interme- 
diate’ context between [+voice] and [—voice] plosives. 

This finding and additional observations concerning the behaviour of /p/ + /s/ 
clusters before word-initial /h/ and lexical vowels might be accounted for in a 
drastically revised version of (1), but the nature of the effects involved as well as 
considerations of economy point to an account of Dutch RVA along the lines of 
proposals in Ernestus (2000) as the anticipatory coarticulation of laryngeal
gestures.


2. Methods 


2.1 Subjects 

Subjects were 4 native speakers of Dutch aged between 21 and 45 at the time 
of recording. Two of the subjects were male (MJ1 and GBP3) and two were fe- 
male (ER2 and LB4). None of the subjects had a history of speech or hearing im- 
pairment. They were not paid for their participation in the experiment. All speak- 
ers were residents of the town of Groningen at the time of recording, but spoke 
varieties of Dutch which can be roughly described as standard with minor (north- 
ern and western) local features (see further section 3.1 below). 


2.2 Materials 

The stimuli for this experiment consisted of clusters combining an initial /p/
(C1) and a medial /s/ (C2), followed by a C3 drawn from /p, t, b, d, m, h/ or an
unreduced lexical vowel (/V/), which is usually preceded by a [ʔ] in Dutch.
Although there is evidence that final laryngeal neutralization is phonetically
complete in Dutch (Baumann 1995), C1 obstruents were consistently /p/ and
orthographic <p>, to avoid any potential bias due to spelling pronunciations or
other incomplete neutralization effects.

C1s were embedded in a noun (N1) representing a proper name. N1s consisted
of a single syllable that had the long low unrounded vowel /a:/ for its nucleus; this
vowel is referred to as V1 in the discussion below. The medial /s/ always
represented an adjectival marker, as in /ka:p/ + /s/, ‘of, from, pertaining to the Cape’, or
a possessive marker, as in /ja:p/ + /s/, ‘belonging to Jaap’. The carrier words (N2)
for C3 were disyllabic nouns with an initial lexical stress. C3 always preceded a
long vowel or (phonotactically long) diphthong. The N1 + N2 collocations were
further embedded in carrier sentences designed to attract a contrastive nuclear
accent on N2. Some sample stimuli (orthographic and phonological representations)
appear in (2). Target clusters are underlined.


(2) Sample stimuli
    a. <Het was Jaaps tunnel die onder water stond, niet zijn kelder>
       /het vas ja:ps tynəl di ondər va:tər stond nit zən keldər/
       It was Jaap’s tunnel that under water stood not his basement
       ‘It was Jaap’s tunnel that was flooded, not his basement’
    b. <Het was een Kaaps meisje dat de hoofdprijs won, niet een Kaaps jongetje>
       /het vas ən ka:ps meisjə dat də ho:vdpreis von nit ən ka:ps joŋətjə/
       It was a Cape girl who the head-prize won not a Cape boy
       ‘It was a little girl from the Cape who won the first prize, not a little boy from the Cape’


2.3 Procedure 

The stimuli were presented to the subjects in a quasi-randomized order to 
avoid consecutive stimuli with identical consonant clusters. The subjects were 
asked to read the list of stimulus sentences 3 times. For the first, Normal reading, 
they were asked to read the stimulus items at a self-selected comfortable rate. In 
an attempt to simulate a noisy environment, the subjects were then fitted with 
sound-treated headphones conveying an 80 dB white noise signal (a noise level
roughly comparable to that on a moving city bus) for the second reading, and 
asked to speak in such a way that they could understand their own speech. The 
aim of impoverishing the subjects’ auditory feedback was to elicit a more hyperar- 
ticulated speech variety that is sometimes referred to as the Lombard reflex 
(Lombard 1910; see Junqua 1996 for an overview). For the third, Fast, reading, 
subjects were asked to read the stimulus items as fast as possible in order to create 
a bias to more hypoarticulated speech. Data concerning the speech rate variation 
that this design was intended to elicit are discussed in chapter 7 of Jansen (2004), 
but the effects in question are irrelevant for present purposes, and consequently all 
data reported below are pooled across reading tasks. 

During each of the three readings subjects were asked to repeat an item if they 
produced a hesitation or speech error that was clearly audible to the experimenter 
and which affected the target cluster. In total, 1 (C1 = /p/, C2 = /s/) * 7 (C3) * 10
(stimuli) * 3 (conditions) * 4 (speakers) = 840 utterances were recorded.
Recordings were made onto minidisk in a sound-proofed room using a Brüel and
Kjær condenser microphone (Type 4165) and measuring amplifier (Type 2609),
and digitized at 22.5 kHz. Segmentation and acoustic measurements were carried
out using the signal analysis package PRAAT (version 3.9). 31 utterances had to
be discarded because they contained a pause between C2 and C3 or small speech
errors, leaving 809 utterances for segmentation and analysis.
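
The utterance counts reported above follow directly from the design, as the following sketch of the arithmetic shows. The list contents are taken from the text, except for the label of the second reading ("Noise"), which is an assumed name; the variable names are hypothetical.

```python
# Design factors as described in the text.
c1_c2_clusters = ["ps"]                          # C1 = /p/, C2 = /s/
c3_contexts = ["p", "t", "b", "d", "m", "h", "V"]
stimuli_per_context = 10
readings = ["Normal", "Noise", "Fast"]           # "Noise" is an assumed label
speakers = ["MJ1", "ER2", "GBP3", "LB4"]

recorded = (len(c1_c2_clusters) * len(c3_contexts) * stimuli_per_context
            * len(readings) * len(speakers))
discarded = 31                                   # pauses or small speech errors
analysed = recorded - discarded

print(recorded, analysed)  # 840 809
```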
Figure 1: Sample segmentation of glottal stop preceded by /ps/. Broadband
spectrogram (frequency in Hz against time in ms) of a /ps/ + [ʔ] cluster:
subject = ER2, condition = Normal


Segment boundaries were determined by visual inspection of waveforms and 
broadband spectrograms based on Fast Fourier Transforms (FFT) on a 5 ms Gaus- 
sian window (spectrogram bandwidth 260 Hz). The boundary between a vowel 
and a following plosive C1 was placed where there was an abrupt change in the
higher frequency energy, as illustrated by Figure 1. The boundary between the C1
plosive and the following /s/ was placed at the end of the release burst of the
plosive, where this could be reliably identified as distinct from the frication noise of
the fricative. In most cases, however, it was impossible to segment the release
burst from the fricative, and the boundary between C1 plosive and /s/ was placed
at the end of the closure stage of the former. The offset of the C2 fricative was de- 
fined as the offset of frication noise. A glottal stop was marked as such if there 
was evidence of irregular glottal pulsing in the signal (this was virtually always 
the case): for an example, see Figure 1. 

The measurements that were made on the basis of the hand-segmented speech 
samples, as well as the relevant derived measures are listed in Table 1, ordered by 
speech segment. All measures relevant to the C1 + C2 cluster and C3 plosives are
known phonetic correlates of [voice] in Dutch (e.g., Slis & Cohen 1969) and
therefore potentially subject to the effects of RVA. Note that all measurements in
the column for C1 + C2 were performed twice, that is, for /p/ and /s/ individually.
Finally, the first measurement point for F0 was placed at 10 ms after the onset of
post-release voicing for plosives.
Segment
V1 | C1 + C2 | C3 | V2
(a) Duration | (d) Duration | (g) VOT (stops) | (j) F0 10-50 ms after C3 release (stops, /m/)
(b) F0 50-10 ms before C1 onset | (e) Voicing duration | (h) Duration of initial voiceless period (/h/) |
(c) F1 50-10 ms before C1 onset | (f) Voicing ratio | (i) F0 10-50 ms after the onset of voicing (/h/) |

Table 1: Acoustic measurements and derived measures


3. Results 


3.1 Phonetic features of C3 plosives 

Unlike western and southern varieties of Dutch, the local varieties of the prov- 
ince of Groningen, where all of the subjects were resident at the time of recording, 
realize [—voice] stops as voiceless aspirated in word-initial and pre-stress medial 
contexts. This raises the possibility that the dialects in question produce [+voice] 
stops with a zero to short lag VOT rather than prevoicing, at least utterance ini- 
tially and following other obstruents, and thus that they possess a VOT contrast 
along the lines of (standard varieties of) English and German (unfortunately I am 
not aware of any instrumental evidence that could settle this issue). Given that the 
presence of prevoicing appears to be a prerequisite for [+voice] stops to be able to
trigger RVA (Kohler 1979, Jansen 2004), it is important to assess the phonetic
realization of the plosives in C3 position before considering their effects on preceding
obstruents. 

The histograms in Figure 2 depict the frequency distributions of the VOT val- 
ues for [-voice] /p, t/ and [+voice] /b, d/ in C3 position. The left pane of this figure 
shows how the majority of VOT values for the [—voice] plosives falls within the 
0-35 ms range that is usually labelled as short lag (91% of tokens < 30 ms). The 
mean VOT for /p, t/ is 16 ms with a 9 ms standard deviation. 

The right pane of Figure 2 shows a bimodal distribution of VOT values for 
/b, d/, with a first peak well within the prevoiced range between —100 ms and —50 
ms, and a second peak in the short lag (> 0 ms) range. The overall mean VOT for 
/b, d/ is -54 ms (standard deviation 45 ms), whilst the mean for tokens in the 
prevoiced range (VOT < 0 ms) is —71 ms (standard deviation 33 ms). There is 
some interspeaker variation in the production of fully voiceless (VOT > 0 ms) 
[+voice] plosives: speakers MJ1 (19/57 tokens, 33%), and ER2 (19/60 tokens, 
32%) show a somewhat stronger tendency to devoicing than speakers GBP3 (4/58 
tokens, 7%) or LB4 (9/58 tokens, 16%). However, the overall proportion of de- 
voiced productions of /b, d/ found here (51/233 tokens, 22%) seems roughly 
equivalent to the 25% devoicing rate for Dutch word-initial /b, d/ reported by van
Alphen (this volume). 
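
The per-speaker and pooled devoicing proportions quoted above can be recomputed from the token counts given in the text. The sketch below uses only those counts; the dictionary and variable names are illustrative.

```python
# (devoiced tokens, total /b, d/ tokens) per speaker, as reported in the text.
counts = {"MJ1": (19, 57), "ER2": (19, 60), "GBP3": (4, 58), "LB4": (9, 58)}

# Percentage of /b, d/ tokens with VOT > 0 ms for each speaker.
rates = {spk: round(100 * dev / total) for spk, (dev, total) in counts.items()}

total_devoiced = sum(dev for dev, _ in counts.values())
total_tokens = sum(total for _, total in counts.values())
overall = round(100 * total_devoiced / total_tokens)

print(rates)                                   # {'MJ1': 33, 'ER2': 32, 'GBP3': 7, 'LB4': 16}
print(total_devoiced, total_tokens, overall)   # 51 233 22
```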


Figure 2: Histograms of the VOT of C3 plosives. Left panel: /p, t/; right panel: /b, d/


Figure 3 depicts the mean F0 trajectories during the first 50 ms following the onset
of post-release voicing for [+voice] plosives (female speakers only). The mean F0
trajectory following /m/ is included for comparative purposes. These trajectories
and the highly similar results for the 2 male speakers suggest that the 4 speakers
investigated here use F0 perturbations as well as differences in VOT to
cue the [voice] contrast in word-initial plosives. At 10 ms into the voiced portion of
the following vowel the difference in F0 between [-voice] and [+voice] stops is 44
Hz (274 vs. 230 Hz) for the female speakers and 31 Hz (214 vs. 183 Hz) for the
male subjects. These values do not appear to be exceptional in light of data
reported elsewhere in the literature: for example, Löfqvist et al. (1989) report a 27
Hz difference at post-release voicing onset for their male Dutch subject.

Figure 3: F0 (Hz) 10-50 ms into the vowel following C3 (horizontal axis: distance
from C3 offset in ms), for female speakers only. P represents /p, t/; B signifies
/b, d/. Clusters ending in /h/ and /V/ excluded. Error bars represent the mean
±1 standard deviation

A final point of interest regarding Figure 3 is that the mean F0 values
following the baseline C3 consonant /m/ appear to pattern with those for [+voice] /b, d/,
rather than with the values found after [-voice] plosives or halfway between 
[+voice] and [—voice] plosives. A similar pattern has been observed for English by 
e.g., House & Fairbanks (1953) and Jansen (2004). 

A one-way ANOVA for C3 laryngeal specification (/p, t/ vs. /b, d/ vs. /m/) on 
the F0 values at 10 ms into the following vowel (female speakers only, clusters
ending in /h, V/ excluded) shows a highly significant effect: F(1,298) = 31.55,
p < .001. Tukey and Scheffé post-hoc tests show that the [-voice] stops are
distinct from both the [+voice] stops and sonorant /m/ (p < .001 for all pairwise
comparisons), whilst the means for the latter 2 groups are not significantly different.

In sum, it seems safe to infer that the 4 subjects on whose speech this study is
based employ VOT and F0 microprosody to signal the [voice] contrast in
word-initial plosives in broadly similar fashion to speakers of (standard) Dutch
investigated elsewhere in the literature.


3.2 C1 + C2 voicing

Phonetic voicing is a cue to phonological [voice] word initially and medially 
in Dutch, and it therefore seems appropriate to use the amounts of voicing found 
in obstruents (potentially) targeted by the process as a measure of RVA. Note in 
this regard that the instrumental study by Slis (1986) uses phonetic voicing as the 
key indicator of voicing assimilation (see further section 3.3 below). 

Figure 4 represents the mean duration of the voiced intervals of /p/ (C1) and /s/
(C2) across C3 contexts. Perhaps the most striking aspect of this diagram is the
marked increase in voicing before [+voice] stops relative to the remaining
environments: the overall difference in C1 + C2 voicing between [+voice] and [-voice]
C3 contexts is 27 ms. This suggests that, contrary to the assertion by Brink (1975)
and others, word-final /p/ + /s/ clusters are subject to some form of regressive
voicing assimilation in Dutch.

A second noteworthy feature of Figure 4 is the seemingly disparate behaviour
of the 3 baseline C3 contexts. The amount of C1 + C2 voicing before /h/ and
lexical vowels (phonetic [ʔ]) is virtually identical to the amount of voicing observed
before [-voice] plosives. The mean duration of C1 + C2 voicing before
word-initial /m/, however, falls almost exactly halfway between the amounts of voicing
found before [-voice] (13 ms difference) and [+voice] plosives (14 ms
difference).
Figure 4: Voicing of C1 + C2. Mean duration of the voiced intervals of /p/
(leftmost segments, light grey fill) and /s/ (rightmost segments, dark grey)
across C3 contexts. P represents /p, t/; B signifies /b, d/. Error bars represent
the mean ±1 standard deviation. All values in ms.


Perhaps surprisingly, this implies that the word-final /ps/ sequences exhibit a
three-way assimilatory pattern, with __/p, t/, __/h/, and __/V/ acting together as a
‘voiceless’ environment, __/m/ as an ‘intermediate’ environment, and __/b, d/ as a
‘voiced’ environment. On the assumption that __/m/ represents a true
‘non-assimilation’ environment, this in turn suggests that Dutch RVA is symmetric in
regard to the feature [voice], with both [+voice] and [-voice] obstruents triggering
regressive voicing assimilation.

A one-way ANOVA for C3 laryngeal specification was performed on the C1 + C2
voicing data to test these impressionistic observations. The 3 baseline C3
environments __/m/, __/h/, and __/V/ were included as three separate laryngeal
specifications in addition to [-voice] (/p, t/) and [+voice] (/b, d/). This ANOVA shows
a highly significant effect, F(4,804) = 48.92, p < .001, which indicates that the C3
context indeed has some assimilatory effect on the voicing of preceding /ps/
clusters.
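
The F ratio and the degrees of freedom of a one-way ANOVA of this kind can be
illustrated with a minimal Python sketch. The function name `one_way_anova` and the
toy durations are assumptions made purely for illustration: they are not the
measurements of the present experiment, and the sketch makes no attempt to
reproduce the statistic reported above.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of samples (one per group)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual tokens around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical voicing durations in ms, invented for illustration only
toy = {
    "/p, t/": [15, 20, 25],
    "/b, d/": [40, 45, 50],
    "/m/": [28, 33, 38],
}
f_stat, df_b, df_w = one_way_anova(list(toy.values()))
print(f"F({df_b},{df_w}) = {f_stat:.2f}")
```

With k groups and N tokens in total the test has k - 1 and N - k degrees of
freedom, so five C3 contexts give 4 numerator degrees of freedom, as in the
F(4,804) reported above.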

Next, Tukey and Scheffé post hoc tests were performed to establish which
pairwise comparisons of the means displayed in Figure 4 show statistically
significant differences. The results of these post hoc tests, which are summarized in
Table 2, indicate that all pairwise comparisons of the means for __/p, t/, __/b, d/,
and __/m/ show statistically significant differences. This suggests that as far as
C1 + C2 voicing is concerned these contexts should be regarded as distinct from each
of the others, and thus that Dutch RVA is at least qualitatively [voice]-symmetric.
          /p, t/    /b, d/    /m/     /h/     /V/
/p, t/    -         *         *       n.s.    n.s.
/b, d/    *         -         *       *       *
/m/       *         *         -       *       *
/h/       n.s.      *         *       -       n.s.
/V/       n.s.      *         *       n.s.    -


Table 2: Results of Tukey and Scheffé post hoc tests on the ANOVAs for C1 + C2 voicing:
*: significant difference (p < .05) according to both tests


Note, moreover, that the differences between both __/h/ and __/V/ on the one 
hand, and __/b, d/ as well as __/m/ on the other hand are statistically significant 
according to both tests, which indicates that the effect of the (phonetic) glottals on 
the voicing of a preceding cluster is indeed likely to be distinct from that of the 
[+voice] plosives and /m/. 


3.3 C1 + C2 voicing ‘classes’ 

In his study of regressive voicing assimilation in Dutch two-way obstruent
clusters, Slis (1986) employs a technique of quantifying RVA that is different
from the method used in the previous section. Slis classifies all obstruents
preceding a [+voice] plosive C2 as ‘unassimilated’ if they have a Voice Termination
Time (VTT) or ‘voice tail’ of < 50 ms and as ‘showing regressive assimilation’ if
their VTT exceeds 50 ms. The cut-off point is based on the VTT of singleton
intervocalic stops in Dutch as measured by Slis (1970), which indicates that there
is a probability < .0025 that [-voice] stops have a VTT equal to or greater than the
mean of 25 ms + 2 standard deviations (of 10 ms each) + 5 ms = 50 ms.
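
The arithmetic behind this cut-off and the resulting two-way classification can be
sketched as follows. The function names `slis_cutoff` and `classify_slis` are
hypothetical, and the treatment of a VTT of exactly 50 ms as ‘unassimilated’ is an
assumption, since the description above leaves that boundary case open.

```python
def slis_cutoff(mean_vtt_ms=25.0, sd_ms=10.0, n_sd=2, margin_ms=5.0):
    """Cut-off as described above: mean VTT + 2 standard deviations + 5 ms."""
    return mean_vtt_ms + n_sd * sd_ms + margin_ms


def classify_slis(vtt_ms, cutoff_ms):
    """Label one obstruent by its Voice Termination Time (ms).

    Assumption: a VTT exactly at the cut-off counts as 'unassimilated'.
    """
    return "assimilated" if vtt_ms > cutoff_ms else "unassimilated"


cutoff = slis_cutoff()  # 25 + 2 * 10 + 5 = 50 ms
# Hypothetical VTT values in ms, for illustration only
labels = [classify_slis(v, cutoff) for v in (12.0, 36.0, 62.0)]
print(cutoff, labels)
```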

The method for quantifying RVA employed by Slis (1986) has two potential
advantages, even if Dutch RVA turns out to be a gradient phonetic process that
renders any description in terms of discrete categories completely arbitrary. First,
it exposes the size of the effect of [+voice] on C1 voicing duration relative to the
inherent variance within a laryngeal category in an intuitively transparent way.
Second, the relative magnitude of [+voice] effects vis-à-vis the noise caused by
within-category variation may be used as a rough indicator of perceptual salience:
O’Shaughnessy (1981) suggests that effects smaller than or equal to a single
standard deviation from a baseline mean should be treated as below the threshold of
perception.

There are several ways in which Slis’s method can be applied to the C1 + C2
voicing data from the present experiment. Assuming that assimilation of C1 + C2
voicing is indeed [voice]-symmetric, a natural procedure is to define three
‘voicing’ classes using the overall mean C1 + C2 voicing duration of 31 ms and its
standard deviation of 26 ms. This yields a ‘voiceless’ category or ‘band’ with
relatively short voiced intervals, a ‘neutral voicing’ class centred around the overall
mean, and a ‘voiced’ category with relatively long voiced intervals.

Regardless of the precise settings of the boundary values delimiting the three
categories, the picture of Dutch regressive voicing assimilation that emerges from
this method does not seem to be substantially different from the one drawn in the
previous section. Figure 5 depicts the classification of /ps/ clusters if the neutral
band is defined as the overall mean of 31 ms ± 1 standard deviation of 26 ms.
There are very few ‘voiced’ clusters preceding [-voice] plosives, /h/, or /V/,
whereas 30% of the cases before /b, d/ belong in this class, with the __/m/ context
almost exactly halfway in between (15%). The frequencies of ‘devoiced’ clusters
hint at the same natural classes of assimilation environments, with similar
frequencies before /p, t/, /h/ and /V/ (17, 18, and 15% respectively), and lower
figures for /m/ (8%) and /b, d/ (3%).
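
The three-band classification can be made concrete with a short sketch. The band
edges follow from the overall mean and standard deviation given above (31 - 26 = 5 ms
and 31 + 26 = 57 ms); the function names and the assignment of values lying exactly
on a boundary to the ‘neutral’ band are assumptions for illustration.

```python
def voicing_bands(mean_ms=31.0, sd_ms=26.0, width_sd=1.0):
    """Lower and upper edge of the 'neutral' band: mean -/+ width_sd * SD."""
    return mean_ms - width_sd * sd_ms, mean_ms + width_sd * sd_ms


def voicing_class(voicing_ms, lower_ms, upper_ms):
    """Assign one /ps/ cluster to a band by its C1 + C2 voicing duration (ms).

    Assumption: values exactly on a boundary are counted as 'neutral'.
    """
    if voicing_ms < lower_ms:
        return "devoiced"
    if voicing_ms > upper_ms:
        return "voiced"
    return "neutral"


lower, upper = voicing_bands()  # (5.0, 57.0)
# Hypothetical voicing durations in ms, for illustration only
classes = [voicing_class(v, lower, upper) for v in (3.0, 31.0, 55.0, 60.0)]
print(lower, upper, classes)
```

Passing a different upper edge (for example 50 ms) to `voicing_class` moves the
boundary between the ‘neutral’ and ‘voiced’ bands without touching the lower one.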

It is interesting to compare the frequency of ‘voiced’ (‘assimilated’) final
clusters found in this study to the numbers reported by Slis (1986). He reports that
[+voice] stops trigger 86% ‘regressive assimilation’ in preceding singleton
plosives across a word boundary and before stress, which is considerably higher than
the proportion of ‘voiced’ cases before [+voice] plosives reported here. Note,
however, that in the classification illustrated in Figure 5 the cut-off point between
the ‘neutral’ and ‘voiced’ bands is 57 ms of voicing as opposed to Slis’s 50 ms. If
the cut-off point is lowered to 50 ms the proportion of ‘voiced’ cases before
[+voice] plosives rises to 36%, which is identical to the frequency of ‘assimilated’
singleton fricatives found by Slis (in the relevant environment). This implies that,
however real, the effect of RVA on plosive + fricative clusters is in some sense
weaker than the effect on singleton plosives and therefore perhaps less audible.
This may in turn account for the claims in the descriptive literature that regressive
voicing assimilation does not apply to plosive + fricative clusters in Dutch.


Figure 5: Frequencies of ‘devoiced’, ‘neutral’, and ‘voiced’ /ps/ clusters before
[-voice] stops, [+voice] stops, /m, h, V/ if the neutral category is defined as
31 ± 26 ms (the overall mean ±1 standard deviation). Numbers inside columns
represent numbers of cases
[Horizontal axis: percent of cases. Legend: voiced, neutral, devoiced.]


3.4 C1 + C2 duration

Segmental duration has often been regarded as a cue to [voice], at least in
word-medial environments (Chen 1970, Raphael 1981, Luce & Charles-Luce
1985). There is some doubt regarding the robustness of stop occlusion duration as
a correlate of [voice] in naturalistic speech (Crystal & House 1988), but there does
seem to be a tendency for stop release bursts of [-voice] plosives and the frication
intervals of [-voice] fricatives (cf. Stevens et al. 1992) to be longer than those of
their [+voice] counterparts.

In light of the cross-linguistically recurrent role of segmental duration in
signalling [voice] contrast, it seems worthwhile to investigate whether it reflects the
effects of RVA, although there is no a priori reason to expect it to do so. If the
way [voice] maps onto segmental duration is generalised to the effects of RVA,
obstruents would be expected to shorten before [+voice] sounds and/or to
lengthen before [-voice] sounds when assimilation applies.

Figure 6 depicts the mean durations of C1 and C2 across C3 environments.
Note first of all how the means for C2 extend across a larger range (20 ms) than
those for C1 (8 ms). This suggests that any effects of RVA on obstruent duration
are relatively ‘local’ in being restricted to sounds immediately adjacent to the
trigger consonants. A second noteworthy feature of Figure 6 is that before [-voice]
plosives, [+voice] plosives and /m/ the mean durations for C1 + C2 overall and for
C2 separately are consistent with the assimilatory pattern of lengthening and
shortening described above. Thus, there is a 22 ms positive difference in C1 + C2
duration between the [-voice] and [+voice] C3 contexts, whilst /m/ (again) seems
to act as an intermediate environment.

The duration of /ps/ before a lexical vowel, on the other hand, appears to be
out of tune with this pattern. The voicing data reported above suggest that (the
glottal stop preceding) a lexical vowel acts as a devoicing environment, which by
the general inverse relationship between segmental length and voicing should
have produced a relatively long C1 + C2 sequence. However, __/V/ is the
‘shortest’ context of all.


Figure 6: Duration of C1 + C2. Mean segmental duration of /p/ (leftmost segments, light grey
fill) and /s/ (rightmost segments, dark grey) across C3 contexts. P represents /p, t/; B signifies
/b, d/. Error bars represent the mean ±1 standard deviation. All values in ms.


A one-way ANOVA for C3 laryngeal specification was performed on the C1 + C2
overall duration data, with /m/, /h/ and /V/ again included as three separate laryngeal
specifications in addition to [±voice]. This ANOVA yields a highly significant
effect, F(4,804) = 25.77, p < .001, which indicates that the status of C3 does
indeed have an effect on the duration of preceding obstruents.

Results of Tukey and Scheffé post hoc tests are summarized in Table 3. These
indicate that there is a significant difference between the (relatively long) mean
duration of /p + s/ before [-voice] plosives and C1 + C2 duration in all other
contexts. Interestingly, the short duration of /p + s/ before /V/ emerges as
significantly different from the means for /p, t/, /m/, and /h/ even though, as hinted
above, a different pattern might be expected on the basis of the C1 + C2 voicing
data and the usual inverse relationship between voicing and segmental duration
more generally.


          /p, t/    /b, d/    /m/     /h/     /V/
/p, t/    -         *         *       *       *
/b, d/    *         -         o       n.s.    n.s.
/m/       *         o         -       n.s.    *
/h/       *         n.s.      n.s.    -       *
/V/       *         n.s.      *       *       -


Table 3: Results of Tukey and Scheffé post hoc tests on the ANOVA for C1 + C2 segmental
duration. *: significant difference (p < .05) according to both tests; o: significant difference
(p < .05) according to Tukey only; n.s.: difference not significant on either test


3.5 V1 duration

There is a crosslinguistic pattern for vowels (and sonorant consonants) to be 
shorter before (pre-stressed) [—voice] obstruents than before their [+voice] coun- 
terparts. Whilst widespread, the extent of this effect is language-specific (Chen 
1970, Kluender et al. 1988). It has been documented for Dutch word-medial ob- 
struents by Slis & Cohen (1969), who report a mean difference of 25 ms between 
vowels preceding word-medial [+voice] and [—voice] obstruents. Given its asso- 
ciation with phonological voicing, both cross-linguistically and specifically for 
Dutch, preceding vowel duration should be regarded as a potential reflex of re- 
gressive voicing assimilation: the effects of lexical [voice] contrast on vowel 
length suggest that any such reflex would consist of vowel lengthening as a result 
of assimilation to [+voice] and/or vowel shortening triggered by assimilation to 
[-voice]. 

Figure 7 shows that differences in the mean duration of V1 among the five
contexts at hand are small (maximally 7 ms), and such differences as do emerge are
difficult to interpret in terms of assimilation to the [voice] value of C3. A one-way
ANOVA for C3 laryngeal specification fails to yield any effect: F(4,804) = 1.39,
not significant, which would seem to warrant the conclusion that voicing
assimilation in two-way clusters has no effect on the length of vowels preceding such
clusters.
Figure 7: Mean duration (ms) of V1 before /ps/ followed by /p, t/, /b, d/, /m/, /h/ and /V/.
Error bars represent the mean ±1 standard deviation


3.6 Low-frequency spectral features 

The ability of [-voice] and [+voice] obstruents to raise and depress, respectively,
the F0 of (the onset of) a following vowel is well-documented in the literature, and
there is some evidence for similar effects of [voice] on F1 trajectories (see
Kingston & Diehl 1994 for an overview). In theory it is possible, therefore, that voicing
assimilation manifests itself in a lowering and/or raising of F0 and F1 during the
offset of the vowel preceding an assimilated obstruent. In spite of the clear F0
differences following C3 consonants reported in section 3.1, there is little in the F0
data evincing any form of regressive assimilation. The F1 values of V1 near vowel
offset show only a small effect of C3 that is (partially) consistent with regressive
assimilation. Table 4 provides mean F0 values at 10 ms before the onset of C1 for
the 2 female subjects, as well as mean F1 values at 10 ms before the onset of C1
pooled over all 4 subjects (recall that the carrier vowel is always /a:/). Note in this
table how the F0 means (and standard deviations) for vowels preceding clusters
ending in /p, t/, /b, d/, /h/, and /V/ are virtually identical, whilst the mean for __/m/
is similar, too.


Mean F1 differences at 10 ms before the onset of C1 are marginally greater,
notably the 20 Hz difference between the [-voice] and [+voice] plosives. A one-way
ANOVA for C3 laryngeal specification on the F1 values at 10 ms before the onset
of C1 (clusters ending in /h, V/ excluded) reveals a weakly significant effect,
F(2,578) = 4.23, p < .02. This suggests that C3 may indeed have a (weak)
assimilatory effect on F1.
C3        F0 at C1 - 10 ms (female)    N      F1 at C1 - 10 ms    N
/p, t/    200 (40)                     115    697 (93)            234
/b, d/    199 (39)                     118    677 (87)            233
/m/       194 (31)                     58     701 (77)            114
/h/       200 (35)                     59     689 (81)            118
/V/       199 (33)                     56     690 (95)            110


Table 4: Mean F0 values (female speakers only) and F1 values (both in Hz)
of V1 (/a:/) at 10 ms before the onset of C1. Standard deviations in brackets
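
As a consistency check, the degrees of freedom of the ANOVAs reported above can
be recovered from the token counts in Table 4. The sketch below assumes that the N
column for F1 gives the number of tokens per C3 context entering each analysis;
the text does not state this explicitly, and the helper name `anova_df` is
hypothetical.

```python
# Token counts per C3 context, copied from the F1 'N' column of Table 4
f1_counts = {"/p, t/": 234, "/b, d/": 233, "/m/": 114, "/h/": 118, "/V/": 110}


def anova_df(counts):
    """Degrees of freedom (between, within) of a one-way ANOVA: k - 1 and N - k."""
    k = len(counts)
    return k - 1, sum(counts) - k


# All five C3 contexts, as in the voicing, duration and V1 analyses
df_all = anova_df(list(f1_counts.values()))
# /h/ and /V/ excluded, as in the F1 analysis
df_three = anova_df([f1_counts[c] for c in ("/p, t/", "/b, d/", "/m/")])
print(df_all, df_three)
```

Under this assumption the counts yield (4, 804) for the five-context analyses and
(2, 578) for the three-context F1 analysis.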


4. Discussion 

The voicing data presented in sections 3.2 and 3.3 indicate that contrary to the
assertions of some phonologists (Brink 1975, Cammenga & van Reenen 1980),
regressive assimilation of voice can occur in Dutch stop + fricative + stop
sequences. Regardless of the exact interpretation of the results, it would be difficult
not to attribute differences in C1 + C2 voicing before /p, t/ and /b, d/ to some form
of regressive voicing assimilation. As emphasized above in section 3.3, this
conclusion remains the same whether RVA is quantified in terms of mean voicing
differences or in terms of ‘voicing categories’ along the lines of Slis (1986). It
remains a possibility, of course, that voicing assimilation is less prevalent, or
otherwise less noticeable, in stop + fricative + stop sequences than in two-way
clusters, and that this has ultimately led to categorical interpretations in the
literature.

Whilst perhaps clarifying a somewhat under-researched aspect of regressive 
voicing assimilation in Dutch, the finding that stop + fricative + stop sequences do 
in fact exhibit assimilation does not in itself have major ramifications for phono- 
logical analyses of the process. Some models predict assimilation in words like 
/fits/ + /bel/ ‘bicycle bell’ (Booij 1981, Grijzenhout & Kramer 1998), others do 
not (e.g., Lombardi 1999), but nothing of great theoretical importance hinges on 
this. 

A perhaps more problematic issue lies in the detail of the voicing data and also
in the patterning of segmental duration. As mentioned before, the means of C1 +
C2 voicing duration before [-voice] plosives, [+voice] plosives and /m/ suggest a
tripartite pattern of assimilation, in which the nasal acts as an ‘intermediate’
context between the two obstruent classes. This pattern of voicing values can be
interpreted as involving assimilation to both /b, d/ and /m/ with /p, t/ representing
the neutral category; alternatively, it might be seen as ‘[voice]-symmetric’
assimilation to both /p, t/ and /b, d/, with the nasal representing the neutral category.
However, as most recent phonological analyses treat [voice] in Dutch as
monovalent (e.g., Lombardi 1994, 1999, Iverson & Salmons 1999), they are unable to
derive tripartite patterns of assimilation.

Probably the easiest way out of this conundrum for proponents of these
models is to maintain that [voice] is phonologically monovalent in Dutch and that any
phonetic distinctions between __/p, t/ and __/m/ are merely the result of a
‘low-level phonetic process’. An analysis in this vein could run roughly as follows:
sequences such as /ps + b/ undergo assimilation in the phonology and become
[+voice] throughout, yielding /bzb/. A natural phonetic interpretation of sequences
of this type then assigns a relatively great amount of voicing to C1 + C2 (perhaps
subject to some spontaneous devoicing later on). Clusters such as /ps + p/ and /ps
+ m/ are left untouched by the phonology and passed on to the phonetics, which
assigns the initial stop and fricative identical amounts of voicing in both
environments. Some minor readjustments then somehow produce a minor increase in C1
+ C2 voicing before /m/ (or a decrease before /p/).

The remainder of this paper represents an attempt to show that the analysis
sketched here probably provides the best fit with the data, albeit with one key
modification such that any three-way cluster of C1 (+ C2) + C3 is passed on to the
phonetics as is, and that any C3-induced changes in the voicing and duration of
preceding obstruents stem from the coarticulation of laryngeal gestures. The
essence of this analysis, which is largely similar to that of Ernestus (2000: section
7.4), is therefore that all of Dutch regressive voicing assimilation is a ‘low-level
phonetic process’.

The first step in this argument is to clarify some important terminology. I use 
coarticulation to refer to the observable results of articulatory strategies that man- 
age the transitions between sounds produced in sequence. The results in question 
include phenomena such as the partial nasalization of vowels adjacent to nasal 
stops found in English and other languages (Cohn 1993); models such as Articula- 
tory Phonology (Browman & Goldstein 1986, 1992) offer formal frameworks to 
capture the (hypothesized) articulatory strategies underpinning coarticulation ef- 
fects. 

There is evidence that the amount of coarticulation between two given sounds 
can be dependent on speech rate and prosody (Solé & Ohala 1991, De Jong et al. 
1992, Byrd & Saltzman 1998), and that patterns of coarticulation are in part lan- 
guage-specific (e.g. Clumeck 1976), but the phenomenon itself is arguably a uni- 
versal aspect of phonetic implementation (see Farnetani 1997). Consequently, in- 
voking coarticulation as the mechanism underlying voicing assimilation in Dutch 
does not entail adding an otherwise superfluous module to the phonology- 
phonetics interface: it merely involves relocating the source of RVA to a different 
part of the interface. 

There is a range of instrumental studies, in particular those by Löfqvist and his
associates, that document aspects of laryngeal, or more specifically glottal,
coarticulation in obstruent sequences (e.g., Yoshioka et al. 1982, Löfqvist & Yoshioka
1984, Löfqvist et al. 1989, Munhall & Löfqvist 1992; see Hoole 1999 for an
overview). For example, Munhall & Löfqvist (1992) show how at slow speaking rates,
the /s#t/ cluster in the phrase <Kiss Ted> tends to be articulated with two separate
abduction gestures (opening and closing movements of the glottis) which
gradually merge as speaking rate increases, initially producing a single gesture with two
abduction peaks, and producing one single-peaked gesture at the highest rates. An
earlier study (Yoshioka et al. 1982) presents (more limited) data showing the
merging of abduction gestures in Dutch obstruent clusters. From the present
perspective it is perhaps unfortunate that most if not all of the work on glottal
coarticulation in obstruent clusters investigates sequences of [-voice] sounds rather
than mixed [voice] clusters, but it nevertheless establishes that (aspects of)
laryngeal articulation are subject to considerable coarticulation effects, in much the
same way as other areas of segment production.

In the absence of articulatory data, the common practice of distinguishing be- 
tween phonetic and phonological processes on the basis of their systematically 
gradient or discrete mode of application provides a relatively useful test to classify 
assimilation rules: their capacity for systematically neutralizing (lexical) phono- 
logical distinctions among the sounds they target. If a rule (all but) erases phonetic 
distinctions between, say, underlyingly [+voice] and [—voice] obstruents whenever 
it applies, it seems safe to infer that it represents a phonological process. If, on the 
other hand, the application of a rule tends to leave a phonetic residue of the under- 
lying phonological distinction, it may represent a coarticulation process. Unfortu- 
nately, this test is unavailable for the case at hand as word-final [voice] contrast is 
(near-)neutralized on independent grounds in Dutch. 

Nevertheless, the data reported in this paper are largely consistent with a view 
of RVA as arising from anticipatory laryngeal coarticulation. Since this view 
yields a uniform account of the observed behaviour of /ps/ on the basis of an in- 
dependently motivated mechanism, I think it should be preferred over a hybrid 
account which attributes voicing assimilation in 4 out of 5 environments to coar- 
ticulation (or some other phonetic process) but which treats voicing and duration 
effects in the remaining environment as the result of a phonological rule (which 
must be postulated separately). The following sections outline how, under certain 
assumptions, the behaviour of /ps/ clusters in each of the 5 environments is ex- 
pected on grounds of anticipatory coarticulation. 


4.1 [-voice] and final obstruents 

Most recent models of (West-Germanic) laryngeal phonology treat the single 
series of word-final obstruents found in Dutch as phonologically identical to the 
contrastively voiceless obstruents found initially and medially. According to e.g., 
Lombardi (1994, 1999) and Iverson & Salmons (1999) this single class of contras- 
tively voiceless and neutralized (voiceless) stops represents the laryngeally un- 
marked category. By contrast, Ernestus (2000) claims that contrastively voiceless 
and neutralized obstruents represent two distinct phonological and phonetic 
classes: the former phonologically marked and actively devoiced phonetically; the 
latter phonologically unmarked and phonetically underspecified for voicing. Both 
classes are distinct from phonologically marked [+voice] obstruents, which are 
actively voiced at the phonetic level (see also Hsu 1996 on Taiwanese, and 
Steriade 1997). 

Ernestus’s principal argument for this three-way classification of Dutch ob- 
struents is that the phonetic voicing of final obstruents is much more dependent on 
phonetic context and other (extralinguistic) factors than the voicing of contras- 
tively voiceless or voiced obstruents. In other words, with respect to phonetic 
voicing the neutralized final obstruents of Dutch behave much like the [+voice] 
plosives of English, which some have long regarded as passively (de)voiced (cf. 
Harms 1973). For example, unlike their contrastively voiceless counterparts, final
obstruents may be (audibly) voiced before sonorant consonants. 

Now the claim that Dutch final obstruents are phonetically underspecified for 
voicing (and presumably other phonetic cues to [voice]) and therefore subject to 
passive voicing and devoicing hardly contradicts the approach of Lombardi (1994, 
1999), Iverson & Salmons (1999), and similar models. In fact, variants of this ap- 
proach that tie phonological unmarkedness directly to phonetic underspecification 
(Harris 1994) predict precisely that this is the case. It is rather the claim that the 
contrastively voiceless obstruents are actively devoiced that is potentially prob- 
lematic for these models. 

Fortunately, there is some instrumental evidence to support this idea. Löfqvist
et al. (1989) conducted an electromyographic (EMG) study of cricothyroid (CT)
activity during the production of Dutch and English obstruents. They report
increased levels of CT activity relative to the surrounding vowels during both
American English [-voice] obstruents, which are quite commonly regarded as
actively devoiced, and Dutch voiceless obstruents. Contraction of the cricothyroid
muscle results in a stiffening of the vocal folds, and Löfqvist et al. (1989) argue
that this can be seen as a devoicing strategy since stiff(er) vocal folds vibrate less
easily.

Moreover, they claim that the timing of the peak in CT activity is such that the
resulting increase in vocal fold tension peaks around the point of consonantal
release. This is consistent with the idea that it is CT activity that is mainly
responsible for the raised F0 after the offset of [-voice] obstruents. Löfqvist et al. (1989)
mention two further observations in support of the hypothesized link between CT
activity and F0 raising. First, there is a statistically significant correlation between
the amount of CT activity during the production of a [-voice] obstruent and the F0
of the first period of the following vowel; and second, the absence of a significant
F0 difference at the offset of the lexical affricates /tʃ/ and /dʒ/ of one of their
English speakers coincides with the absence of raised CT activity during the [-voice]
affricate.

Note that Löfqvist et al. (1989) are keen to point out that any differences in CT
activity between [+voice] and [-voice] obstruents are due to increased activity in
the latter rather than decreased activity in the former, which exhibit no substantial
deviation from levels of CT activity in the surrounding vowels. Assuming that
there indeed is a link between CT activity in obstruent production and F0 levels in
following vowels, this suggests that F0 differences at the offsets of (Dutch and
English) [+voice] and [-voice] obstruents are solely due to F0 raising after the
latter. This tallies neatly with the observation in 3.1 that F0 is roughly the same at the
offset of /b, d/ and /m/ (cf. Figure 3).

The CT study of Löfqvist et al. (1989) can hardly be treated as decisive given
the fact that it involved only one Dutch-speaking subject, and also in light of an
earlier study by Collier et al. (1979), which fails to find increased CT activity
during the production of Dutch [-voice] obstruents. However, some additional
support for the idea that these obstruents are actively devoiced may be gleaned from
the paper by Yoshioka et al. (1982) already mentioned above. EMG data on
posterior cricoarytenoid (PCA) activity during the production of Dutch [-voice]
obstruents show small peaks associated with word-initial /p/, whilst the
corresponding glottal transillumination data show a small abduction gesture (no data are
provided on /t/ or /k/). Since PCA contraction contributes to vocal fold abduction,
it is easily construed as a devoicing strategy, and consequently the presence of
increased PCA activity during the production of [-voice] stops may well be a sign
of active devoicing.

Interestingly, Yoshioka et al. (1982:31) comment on the “negligible size” of 
any increase in PCA activity and glottal abduction during the production of word- 
final /p/. They attribute this observation to “the glottalization of a stop sound in 
this position”, but unlike (several dialects of) English, Dutch does not normally 
glottalize final stops. The virtual absence of PCA activity and glottal abduction 
during word-final /p/ is therefore perhaps better attributed to phonetic underspeci- 
fication of [voice], or alternatively to phonetic reduction of the already small ab- 
duction gesture associated with word-initial /p/. 

Given that neutralized final and contrastively voiceless obstruents belong to
two distinct phonetic categories, it is unsurprising that the latter trigger some
degree of devoicing in the former relative to sounds that are not actively devoiced,
such as [m]. The active devoicing measures associated with [-voice] word-initial
obstruents are likely to carry over to some extent into preceding
voice-underspecified obstruents, and it does not seem implausible that this clips the voicing of
these underspecified sounds somewhat relative to passively voiced environments.

Moreover, to the extent that anticipatory coarticulation of actively devoiced
word-initial plosives has an effect on the frication interval of preceding [voice]-
underspecified word-final fricatives, this effect is likely to be positive. The genera-
tion of sustained turbulence noise at an oral constriction is critically dependent on
high transglottal airflow, and this motivates the large glottal abduction gestures 
typically associated with voiceless fricatives (Stevens et al. 1992, Stevens 1998). 
Consequently, coarticulating a [voice]-underspecified fricative with a segment 
that is itself associated with a glottal abduction gesture (albeit a small one) may 
well increase the duration of the interval during which transglottal airflow is 
above the critical threshold for noise generation. 

In sum then, assuming that word-final neutralized obstruents are underspeci- 
fied for [voice], coarticulation with a following actively devoiced obstruent is 
likely to result in a decreased amount of voicing, and in the case of fricatives, a 
relative increase in duration. 


4.2 [+voice] obstruents 

Aerodynamic models of the vocal tract indicate that a number of supplemen- 
tary articulatory strategies are required in addition to vocal fold adduction to pro- 
duce phonetically voiced stops post-pausally and after another obstruent (Ohala 
1983, Westbury & Keating 1986, Stevens 1998). A number of the (possible) strat- 
egies involved (larynx lowering, tongue root advancement, raising of the soft pal- 
ate, relaxing the tissue lining the vocal tract walls) are aimed at maintaining a 
transglottal pressure difference that is sufficient for voicing by expanding the cav-
ity behind the oral occlusion; others (decreasing vocal fold tension) aim to lower
the transglottal pressure threshold below which voicing is impossible (Ladefoged 
1973, Stevens 1998). 

I am not aware of any instrumental studies documenting the deployment of 
these active voicing measures in the production of Dutch [+voice] stops, but it 
seems safe to assume that at least some of them are deployed by speakers of 
(southern and western varieties of) Dutch. 

As a result, anticipatory coarticulation of actively voiced plosives such as 
Dutch /b, d/ is likely to improve the conditions for voicing during preceding 
[voice] underspecified obstruents, and therefore to increase the amount of voicing 
relative to other contexts. Whether the broad magnitude of the voicing effect ob- 
served in sections 3.2 and 3.3 follows from the active voicing measures deployed 
by speakers of Dutch is a matter that can only be resolved on the basis of articula- 
tory data and vocal tract modelling, but it seems that in qualitative terms the effect 
is as expected on the basis of laryngeal coarticulation. 

In addition, a coarticulation-driven account predicts the relatively short dura- 
tion of the C2 fricative before [+voice] plosives, which accounts for practically all 
of the reduction in C1 + C2 overall duration in this environment. Actively voiced
stops are minimally accompanied by a glottal adduction gesture, and any anticipa- 
tory coarticulation of this gesture would lead to some amount of reduction in the 
temporal extent and peak glottal width of the abduction gesture associated with a 
preceding fricative. This would in turn reduce the window during which transglot- 
tal airflow is sufficient for the production of turbulence noise at the alveolar ridge, 
and consequently reduce the duration of the frication interval of /s/. 

In other words, it is unnecessary to postulate a phonological rule spreading
[+voice] to derive an increase in the amount of C1 + C2 voicing and a decrease in
(C1 +) C2 duration before /b, d/, even though it remains to be seen if the magnitude
of the voicing effect observed above can be derived from coarticulation alone.


4.3 /m/

The Dutch bilabial nasal is generally produced fully voiced, but in the absence 
of contrastive voicing in the Dutch nasal inventory, it seems perfectly reasonable 
to assume that this is the result of spontaneous voicing, and thus that /m/ is not 
accompanied by any local laryngeal gestures. Consequently, there is nothing at 
the laryngeal level for anticipatory coarticulation to spread into the preceding ob-
struents, and /ps/ + /m/ clusters are subject to spontaneous voicing (during the
early stages of the obstruent clusters and during the nasal) and spontaneous de- 
voicing (during the remainder of the obstruent cluster) throughout. 

This means that a coarticulation-based account predicts that C1 + C2 voicing
and duration values for /ps/ + /m/ sequences are roughly intermediate between
those for /ps/ + /b, d/ sequences, in which the obstruent cluster is likely to acquire
some amount of active voicing, and those for /ps/ + /p, t/ clusters, in which C1 +
C2 is likely to acquire some amount of active devoicing.


4.4 /h/

The Dutch glottal fricative has been represented as phonologically [—voice] by 
some phonologists (Trommelen & Zonneveld 1979) but as [+voice] by others 
(Booij 1981), and has been described as phonetically voiced [ɦ] (e.g., Gussenho-
ven 1999). However, the key to understanding Dutch /h/ and its possible coarticu-
latory effects on a preceding /ps/ sequence is not to read too much into these la- 
bels and/or to treat it on a par with oral consonants, but to take a closer look at its 
phonetic realization. 

According to Rietveld & Loman (1985), quoting Slis & Damsté (1967), trans- 
illumination evidence suggests that intervocalic /h/ is characterized by a glottal 
abduction gesture comparable to that of (contrastively) “voiceless intervocalic 
fricatives and plosives”, but they point out that vocal fold vibration is possible 
during (much of) this abduction gesture due to relatively favourable aerodynamic 
conditions (i.e., the lack of increasing intra-oral pressure). As would perhaps be 
expected on the basis of this information, electroglottographic and acoustic data 
collected by Rietveld & Loman (1985) indicates that the voicing of Dutch /h/ is 
subject to a great deal of contextual variability. For example, they report that ut- 
terance-initially and after voiceless /s/ the glottal fricative is realized with an ini- 
tial voiceless interval with a mean duration of around 40 ms. After /n/ or a vowel, 
on the other hand, this voiceless interval is nearly always absent. 

Data from the present experiment are in broad agreement with these findings.
In C3 position /h/ is realized with an initial interval of voiceless aspiration in 94%
of cases; the mean duration of this interval is 34 ms. Interestingly, in the tokens
produced by the two female subjects the mean F0 at 10 ms after the onset of voic-
ing is 204 Hz, and this rises to 221 Hz at 50 ms. This is much lower than the val-
ues found after the offset of [-voice] /p, t/ and lower even than the values found
for /m/ and /b, d/ (see figure 3; similarly low values were obtained for the male
subjects). Even if the offset of /h/ could not be reliably segmented, it is safe to as-
sume that measuring F0 at 10 ms after the onset of voicing in /h/ is hardly equiva-
lent to measuring F0 10 ms after the offset of an oral consonant and not too much
weight should therefore be attached to the exact F0 value. Nevertheless, in terms
of its effects on F0, /h/ should probably be grouped with /b, d/ and /m/ rather than
with the [-voice] plosives, and extrapolating to underlying CT activity this im-
plies that /h/ should not be regarded as actively devoiced. Since there are no indi-
cations (or aerodynamic grounds on which it is plausible) that /h/ is actively
voiced either, the most straightforward analysis of this sound treats it as spontane-
ously voiced.

The key to its effect on the voicing of a preceding obstruent cluster, then, 
should be sought not in active devoicing measures but simply in the anticipatory 
coarticulation of its abduction gesture. The glottal fricative itself may be partially 
or wholly voiced in spite of it, but when an oral constriction is superimposed on 
glottal abduction in the absence of any articulatory strategies promoting voicing 
(as in voiced fricatives) the outcome is devoicing. Consequently, anticipatory 
coarticulation of the abduction gesture associated with /h/ is likely to result in a
decrease in voicing in a preceding [voice]-underspecified obstruent cluster, even if
the trigger is not itself realized as fully voiceless. 

A (potential) problem with this account is that the logic of section 4.1 would 
seem to predict a longer duration for the /s/ immediately preceding the /h/ than the 
value obtained from the experiment, which is 9 ms shorter than the duration of /s/ 
before /p, t/ rather than in the same range, or longer. I currently have no solution 
to this problem. 


4.5 /V/ 

The behaviour of /ps/ before a lexical vowel is probably the clearest clue that
anticipatory laryngeal coarticulation plays an important role in shaping the dura- 
tion and voicing of /ps/ clusters across the contexts studied here. 

As mentioned above, lexical vowels were generally preceded by a glottal stop. 
The glottal compression gesture involved in the production of a glottal stop is a 
well-known inhibitor of vocal fold vibration. It seems likely therefore that the 
coarticulation of this gesture has a negative effect on the voicing of preceding ob- 
struents. At the same time, coarticulation of glottal compression is likely to cause 
a shortening of /s/ relative to other environments by virtue of the same mechanism 
that shortens the fricative before [+voice] stops: reduction of the size and temporal 
extent of the glottal abduction gesture that is critical to the production of turbu- 
lence noise. 

Thus, a coarticulation-based account provides a straightforward explanation of
the cooccurrence of two effects (devoicing and shortening) that might seem puz-
zling under a more phonological approach to the realization of the [voice] contrast.


5. Conclusions 

This paper has attempted to defend two principal claims. The first simply 
holds that regressive voicing assimilation occurs in Dutch fricative + stop + frica- 
tive sequences. The second claim is that regressive voicing assimilation, at least in 
these clusters, reflects the effects of anticipatory coarticulation rather than the ef- 
fects of a phonological rule and a set of separate, ‘low level’, phonetic processes. 

The first claim receives solid support from the results of the experiment re-
ported here. It is probably worthwhile to stress again that regardless of observa-
tions concerning the remaining 3 contexts, the clear difference in C1 + C2 voicing
between [+voice] and [-voice] environments evinces regressive assimilation in
production, even by the criteria applied by Slis (1986). This finding does not in it-
self have far-reaching ramifications for phonological theory, but it roundly con-
tradicts the claims of Brink (1975) and Cammenga & van Reenen (1980) with re-
gard to assimilation in three-way obstruent clusters, to the extent at least that these
claims pertain to production rather than perception.

The second claim is largely consistent with the detail of the C1 + C2 voicing
and duration data and, given that anticipatory coarticulation is available anyhow 
as part of the linguistic phonetics as a source of ‘low-level’ phonetic processes, it 
entails a more parsimonious model of Dutch RVA than one that employs an addi-
tional rule of RVA in the phonology. In addition, the phonetic manifestation of
Dutch RVA in the clusters investigated here exhibits similarities with assimilatory
patterns in English obstruent sequences investigated by Jansen (2004, in press). 
The latter arguably represent coarticulatory processes because they are non- 
neutralizing and in particular because they do not affect vowel length distinctions 
between underlyingly [+voice] and [—voice] obstruents. Nevertheless, given the 
absence of key articulatory data, this second claim must remain far more specula- 
tive, and no doubt contentious, too. 

Note, finally, that even if the analysis proposed here proves correct for voicing 
assimilation in three-way clusters, this does not entail that it extends to RVA in 
other contexts. Claims to this effect can only be substantiated on the basis of addi- 
tional phonetic research, and I have little doubt that such research would bring to 
light further complications of textbook views of voicing assimilation in Dutch and 
elsewhere. 


Acknowledgements 

This paper is a considerably reworked version of chapter 7 of my University of Groningen PhD 
dissertation (Jansen 2004). The work reported here was funded by award 200-50-068 from the 
Netherlands’ Organization for Scientific Research (NWO) and a research stipend from the Univer-
sity of Tübingen. Thanks are due, first and foremost, to my test subjects. For their comments on
(earlier versions of) this work I am grateful to two reviewers, to John Nerbonne, Dicky Gilbers,
Zoë Toft and audiences at the 2003 Voicing in Dutch workshop at the University of Leiden, the
2001 spring meeting of the Linguistics Association of Great Britain, and at the Max Planck Insti- 
tute for Psycholinguistics in Nijmegen. Any remaining errors are my own. 


Notes 

1. The Dutch glottal fricative has been treated as phonologically [-voice] (Trommelen & Zonne-
veld 1979) but, as a reviewer points out, is sometimes regarded as voiced [ɦ] phonetically, at least
in some contexts (cf. Gussenhoven 1999). The voiced/voiceless status of the Dutch glottal fricative 
is discussed in some detail in section 4 below; meanwhile I have opted to use /h/ in broad, ‘pho- 
nemic’, transcriptions. 


2. Subjects were tested on an additional set of stimuli during the experiment. The target consonant 
sequences in this set combined an initial /k/ (orthographic <k>) and medial /s/ with a /p, t, b, d/ in 
C3 position. Data from this part of the experiment is excluded from the discussion below as the 
behaviour of /ks/ clusters was not found to be significantly different from that of the corresponding 
/ps/ sequences. 


3. Strictly speaking it is not clear whether these morphemes are [—voice] /s/ or unspecified for 
[voice] but for typographical reasons I will represent them as /s/. 


4. The corresponding values for the 2 male subjects exhibit a similar pattern: /p, t/ = 162 Hz (25); 
/b, d/ = 163 Hz (26); /m/ = 168 Hz (27); /h/ = 163 Hz (27); /V/ = 163 Hz (28). 


5. The [-voice] fricative /s/ shows much higher levels of PCA activity, but in light of the glottal 
abduction required for the production of (sibilant) fricatives (see below), this is not necessarily a 
sign of active devoicing. 


6. Note, incidentally, that it is not uncommon for short lag VOT [-voice] plosives of the type 
found in Dutch to trigger RVA in preceding [+voice] obstruents, as for example in French (Dell 
1995), Yiddish (Katz 1987), and Hungarian (Kenesei et al. 1998, Siptár & Törkenczy 2000; see
Toft & Jansen 2003, Jansen 2004 for preliminary phonetic data). This observation has always been
somewhat of an embarrassment to monovalent models of [voice] contrast (cf. Wetzels & Mascaró
2001), and may be an indication that [-voice] short lag VOT stops are actively devoiced more
generally.


In Dutch, all morpheme-final obstruents are voiceless in word-final position. As a
consequence, the distinction between obstruents that are voiced before vowel-initial 
suffixes and those that are always voiceless is neutralized. This study adds to the ex- 
isting evidence that the neutralization is incomplete: neutralized, alternating plosives 
tend to have shorter bursts than non-alternating plosives. Furthermore, in a rating 
study, listeners scored the alternating plosives as more voiced than the non- 
alternating plosives, showing sensitivity to the subtle subphonemic cues in the 
acoustic signal. Importantly, the participants who were presented with the complete 
words, instead of just the final rhymes, scored the alternating plosives as even more 
voiced. This shows that listeners’ perception of voice is affected by their knowledge 
of the obstruent’s realization in the word’s morphological paradigm. Apparently, 
subphonemic paradigmatic levelling is a characteristic of both production and per- 
ception. We explain the effects within an analogy-based approach. 


1. Introduction 

In many languages, morpheme-final obstruents alternate between voiced and 
voiceless, depending on their position in the word. Such obstruents are generally 
voiced before vowel-initial suffixes, and voiceless elsewhere (unless they are sub- 
ject to regressive voice assimilation). To give an example, the final obstruent of 
the Dutch morpheme mand ‘basket’ is voiced in the plural [mandən] manden and
voiceless in the singular [mant] mand (in this article we only give broad transcrip- 
tions). A word’s morphological paradigm, which we define as consisting of all 
other words containing the word’s stem, may thus show alternation in the voice 
specification of the stem-final obstruent. 

This alternation of voice within morphological paradigms raises the question 
of whether the voiced stem-final obstruents affect the acoustic characteristics and 
interpretation of their intraparadigmatic voiceless counterparts. In other words, 
does the presence of [mandən] manden in the morphological paradigm of mand
affect the production and interpretation of the [t] of [mant] mand? 

In generative grammar, the alternation between voiced and voiceless obstru- 
ents is traditionally accounted for by means of underlying forms. A morpheme- 
final obstruent that is voiced before vowel-initial suffixes is also voiced in the un-
derlying form. Thus, the underlying form of [mant] is /mand/ because of the
voiced [d] in the plural [mandən]. Obstruents that are voiced in the underlying
form are devoiced in syllable-final position by a rule or constraint of Final De- 
voicing (e.g., Booij 1995), which turns the /d/ of the singular /mand/ into [t]. 
Within generative grammar, the question of the effect of voice alternation on pro- 
duction and perception can thus be considered as a question of the effect of the 
underlying voice specification. 
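
The generative account summarized above can be illustrated with a minimal sketch. It assumes broad, ASCII-friendly transcriptions and a hypothetical mapping from voiced to voiceless obstruents (DEVOICE); it also simplifies the rule to word-final position, whereas the text states it for syllable-final position. It is an illustration of the idea only, not an implementation taken from any of the works cited.

```python
# Hypothetical mapping from voiced obstruents to their voiceless counterparts
# (an assumption for illustration; it covers only a few segments).
DEVOICE = {"b": "p", "d": "t", "z": "s", "v": "f"}


def final_devoicing(underlying: str) -> str:
    """Return the surface form of an underlying form after Final Devoicing.

    Simplification: only the word-final segment is considered.
    """
    if underlying and underlying[-1] in DEVOICE:
        return underlying[:-1] + DEVOICE[underlying[-1]]
    return underlying


# The alternating stem /mand/ surfaces with [t]; non-alternating /krant/ is unchanged.
print(final_devoicing("mand"))   # mant
print(final_devoicing("krant"))  # krant
```

Because the plural suffix follows the stem-final obstruent, a form such as the plural of /mand/ does not end in an obstruent and passes through this function unchanged.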

In this paper, we will not refer to the notion of underlying form. Recent studies 
have shown that the mental lexicon contains form representations for a great many 
words, including inflected forms (e.g., Jackendoff 1975, Baayen, Dijkstra & 
Schreuder 1997, Alegre & Gordon 1999, Baayen, McQueen, Dijkstra & Schreu-
der 2003). Thus, it contains the singular [mant] as well as the plural [mandən].
Given the storage of plurals, the realization of the stem-final obstruent in the plu- 
ral need not be stored as a diacritic on the stem. It is already stored in the repre- 
sentation of the plural itself. We tentatively assume that all forms that are stored 
in the mental lexicon are surface representations, which reflect the actual realiza- 
tion of the forms, and that the lexicon does not contain abstract underlying forms. 
We refer to stem-final obstruents that surface as voiced in some forms of their 
paradigm as “alternating obstruents” (instead of “underlyingly voiced” obstru- 
ents), and we regard possible effects of voice alternation as intraparadigmatic ef- 
fects, which the different forms of the same morphological paradigm may exert 
upon each other. 

Previous studies have documented an effect of intraparadigmatic voice alterna- 
tion in production for several languages, including Dutch, German, Polish, and 
Catalan (e.g., Dinnsen & Charles-Luce 1984, Port & O’Dell 1985, Slowiaczek & 
Dinnsen 1985, Port & Crawford 1989, Charles-Luce 1993, Warner, Jongman,
Sereno & Kemps 2004, Ernestus & Baayen 2006). Although all word-final ob- 
struents are generally voiceless in these languages, the alternating obstruents tend 
to have more acoustic characteristics of voiced obstruents than non-alternating 
obstruents, which are always voiceless. These alternating obstruents tend to be 
shorter, to be realized with vocal fold vibration during a longer period, and to be 
preceded by longer vowels. Thus, the [t] of a word such as [mant] mand ‘basket’ 
tends to have more acoustic characteristics of voiced obstruents than the [t] of a 
word such as [krant] krant ‘newspaper’, which has the plural [krantən] kranten.
Hence, the neutralization at word-final position between alternating and non- 
alternating obstruents is incomplete. In what follows, we will refer to voiceless 
obstruents that possess some acoustic characteristics of genuine voiced obstruents, 
such as a relatively short duration or a relatively long preceding vowel, as weakly 
voiced. 

For Dutch, the realization of alternating and non-alternating obstruents in ex- 
isting words has been investigated by Warner et al. (2004). They carried out a 
production experiment with 15 native speakers of Dutch, and found a significant 
difference in vowel duration of 3.5 ms. Ernestus & Baayen (2006) carried out a
production experiment with pseudowords. The alternating/non-alternating char- 
acter of final plosives was indicated by their spelling: in Dutch, alternating plo- 
sives are represented by graphemes for voiced phonemes ([b] or [d]) in all posi-
tions in the word, while non-alternating plosives are represented by graphemes for
voiceless phonemes ([p] or [t]). Ernestus and Baayen found a statistically signifi- 
cant difference between the alternating and non-alternating final plosives with re- 
spect to the duration of their release noise (their burst and the following period of 
aspiration). The release noises of plosives represented as voiceless were on aver- 
age 23 ms longer than the release noises of plosives represented as voiced. 

Several authors, however, argue that incomplete neutralization is not a charac- 
teristic of spontaneous speech, but would be induced by spelling, especially when 
speakers are asked to read aloud minimal word pairs such as Rat–Rad (e.g.,
Fourakis & Iverson 1984, Warner, Good, Jongman & Sereno 2006). Thus, Foura- 
kis & Iverson (1984) reported a series of experiments in which speakers showed 
incomplete neutralization when reading aloud minimal word pairs, but failed to do 
so when they repeated infinitives and produced the corresponding past-tense 
forms and past participles. In contrast, Ernestus & Baayen (2006) showed that
speakers may also show incomplete neutralization when they read aloud lists of 
pseudowords without detecting the minimal word pairs and without being aware 
of the purpose of the experiment. Moreover, Dinnsen & Charles-Luce (1984) 
showed that incomplete neutralization is also present in Catalan minimal word 
pairs that do not reflect the difference between alternating and non-alternating ob- 
struents in their spelling. 

What is important for the present study is not so much whether speakers pro- 
duce incomplete neutralization also in spontaneous conversations, but that listen- 
ers are sensitive to the fine acoustic differences induced by incomplete neutraliza- 
tion. Although the acoustic differences between alternating and non-alternating 
obstruents are generally small, listeners are able to take advantage of these subtle 
differences. They assign the correct spelling at significantly above chance level to 
the members of minimal word pairs that differ from each other only in the alter- 
nating/non-alternating character of the final obstruent (e.g., Port & O’Dell 1985, 
Port & Crawford 1989, Warner et al. 2004). For instance, when Dutch listeners 
hear [rat], they assign, at just above chance level, the intended meaning raad ‘ad-
vice’ with the plural [radən], or raat ‘comb’ with the plural [ratən]. They opt
slightly more often for raad when the final obstruent is weakly voiced, and for 
raat when the obstruent is completely voiceless. 

Additional evidence for listeners’ sensitivity to incomplete neutralization 
comes from listeners’ choices of allomorphs for pseudowords. In Dutch, the 
choice between the past tense allomorphs -de [də] and -te [tə] depends on the al-
ternating/non-alternating character of the stem-final segment. If the segment is
always realized as voiceless, the appropriate allomorph is -te, otherwise it is -de
(see also Zonneveld, this volume). Ernestus & Baayen (2003) showed that when
speakers do not know how the final obstruent is realized in morphologically re-
lated words, that is, when they have no information about whether it is an alternat-
ing obstruent, they tend to base their choice between -te and -de on the word’s
phonological similarity neighbourhood. If most words ending in the same type of
rime take -te, speakers tend to choose -te, and if most words take -de, the major-
ity of speakers choose -de. Ernestus & Baayen (2006) found that listeners also
choose -de more often when the final obstruent is realized with weak voicing.
This shows that listeners base their choice in addition on the detailed acoustic 
characteristics of the words. They are sensitive to incomplete neutralization even 
when this is not a requirement for the task that they are performing. 
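
The neighbourhood-based choice described above can be sketched as a simple majority vote. Everything in the sketch is an assumption made for illustration: the toy lexicon, the rime labels, and in particular the voicing_bias parameter, which stands in for the acoustic cue of weak voicing as a number of extra votes for -de. The studies cited do not necessarily model the effect in this way.

```python
from collections import Counter


def choose_allomorph(rime, lexicon, voicing_bias=0.0):
    """Pick 'te' or 'de' for a pseudoword by majority vote among words sharing its rime.

    lexicon is a list of (rime, allomorph) pairs. voicing_bias is a hypothetical
    number of extra votes for 'de'. Ties are resolved in favour of 'te'.
    """
    votes = Counter(allomorph for r, allomorph in lexicon if r == rime)
    de_votes = votes["de"] + voicing_bias
    te_votes = votes["te"]
    return "de" if de_votes > te_votes else "te"


# Invented toy lexicon (not data from the studies cited).
toy_lexicon = [
    ("ap", "te"), ("ap", "te"), ("ap", "de"),
    ("if", "de"), ("if", "de"), ("if", "te"),
]

print(choose_allomorph("ap", toy_lexicon))                    # te
print(choose_allomorph("if", toy_lexicon))                    # de
print(choose_allomorph("ap", toy_lexicon, voicing_bias=1.5))  # de
```

The last call shows how, under these assumptions, an acoustic cue can tip a choice that the neighbourhood alone would decide the other way.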

Lahiri, Jongman & Sereno (1990) also studied intraparadigmatic effects in the 
processing of voice. They carried out an experiment in which listeners of Dutch 
heard a verbal stem followed by the clitic pronoun d’r ‘her’ (the prime), and then 
performed auditory lexical decision on the same verbal stem in isolation (the tar- 
get), realized with a voiceless final obstruent. The consonant cluster in the prime, 
consisting of the final obstruent of the verbal stem and the initial consonant of the 
clitic pronoun, was realized as voiceless (e.g., [kistər] kies d’r ‘choose her’), or as
voiced ([kizdər]). Both realizations are well-formed in Dutch (e.g., Zonneveld
1983). Listeners appeared to respond faster to target words ending in alternating 
obstruents when these obstruents were voiced in the prime, and to target words 
ending in non-alternating obstruents when these obstruents were voiceless in the 
prime. Unfortunately, Lahiri et al. do not report details on the acoustic characteris-
tics of the target words, and it is therefore not clear to what extent the partici-
pants’ behaviour might have been affected by incomplete neutralization. In addi-
tion, the study reports no statistics, and it is therefore not clear for which obstru-
ents the observed differences are statistically significant, or even
whether any of the reported differences reach significance.

In another study, Jongman, Sereno, Raaijmakers & Lahiri (1992) reported an 
experiment suggesting that the interpretation of a vowel as phonologically long or 
short depends on the alternating/non-alternating character of the following voice- 
less final obstruent. Listeners seem to attribute part of the length of a vowel to 
weak voicing if the following obstruent is alternating. The study does not report 
any statistics, but it does report for each participant the crossover boundaries (in 
ms) for each of the three studied word pairs, which differ in the alternating/non-
alternating character of the final obstruents. Our analysis of these data shows, un-
fortunately, that the differences between the word pairs fail to reach significance
at the five-percent level (effect of word pair in an analysis of variance: F(2, 42) = 
2.70; p = 0.08), so that it is uncertain what conclusions might be drawn from this 
study. 
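
The reported p-value can be checked against the reported F statistic. The sketch below relies on the fact that, when the numerator has exactly 2 degrees of freedom, the upper-tail probability of the F distribution has the closed form (1 + 2F/df2)^(-df2/2). The function name is hypothetical, and the shortcut does not hold for other numerator degrees of freedom.

```python
def f_upper_tail_df1_2(f_value: float, df2: int) -> float:
    """Upper-tail probability of an F(2, df2) statistic.

    Closed form valid only when the numerator degrees of freedom equal 2.
    """
    return (1.0 + 2.0 * f_value / df2) ** (-df2 / 2.0)


# Values reported in the text: F(2, 42) = 2.70.
p = f_upper_tail_df1_2(2.70, 42)
print(round(p, 3))  # about 0.079, in line with the reported p = 0.08
```

The result lies above the five-percent level, consistent with the statement that the differences between the word pairs fail to reach significance.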

Summing up, previous studies have shown that alternating obstruents tend to 
be weakly voiced, at least in careful speech, and that listeners are sensitive to the 
subtle cues for weak voicing in the acoustic signal. In other words, the intrapara- 
digmatic realizations of an obstruent affect its production in word-final position 
by inducing incomplete neutralization, and they affect word comprehension by the 
mediation of incomplete neutralization in the acoustic signal. 

The present study addresses the question of whether voice alternation might 
affect speech perception over and above mediation by incomplete neutralization in 
the acoustic signal. That is, does the listeners’ knowledge about a word’s morpho- 
logical paradigm codetermine their percept of this word? Thus, does the lexical 
information about the plurals of [mant] and [krant] ([mandən] and [krantən])
cause listeners to perceive the [t] of [mant] as more voiced than the [t] of [krant]?

We report a transcription experiment in which listeners, who are not phoneti-
cally trained, rated word-final obstruents on a five-point scale as voiceless or
voiced. Several authors (e.g., Vieregge 1987, Cucchiarini 1993) have argued that 
the phonetic transcription of speech sounds may be affected by their orthographic 
representations, the phonotactics of the language, and semantics. In addition, 
Kemps, Ernestus, Schreuder & Baayen (2005) have shown that the transcription 
of reduced word forms is affected by their unreduced counterparts. Listeners re-
port the presence of [l] in words ending in the suffix -lijk [-lək], even when this
suffix is reduced to [k]. In the present study, we investigated whether phonetic
transcriptions made by naive participants might also be affected by lexical infor- 
mation about the segments’ realizations in the word’s morphological paradigm. 
Such information is of no use in a transcription task, where listeners have to base 
their judgments exclusively on the acoustic signal. In fact, intraparadigmatic ef- 
fects, if present, would give rise to less accurate phonetic transcriptions. Hence, if 
we find such intraparadigmatic effects for naive participants, these effects must be 
automatic, that is, not available to participants for conscious, strategic control, and 
therefore a characteristic of everyday speech perception. 

We opted for a rating task, instead of a traditional transcription task in which 
listeners represent sounds by IPA-symbols, because a rating may reveal more sub- 
tle differences in the listeners’ percept of the voicing of alternating and non- 
alternating obstruents. We presented one group of listeners with full words, and 
another group with the final rimes of the same words. Since incomplete neutrali- 
zation may be present in the acoustic signal, we may expect a difference between 
the ratings for the alternating and non-alternating obstruents by all listeners. Cru- 
cially, the listeners hearing the words could identify all words presented, and their 
judgments could therefore also be affected by lexical information about the in- 
traparadigmatic realizations of the final obstruents. In contrast, the listeners hear- 
ing the rimes could identify the presented words only in a smaller number of 
cases, and a lexical effect on their judgments, if present, would necessarily be 
smaller. Hence, if intraparadigmatic realizations affect the perception of voice, we 
may expect a difference between the two groups of listeners, such that listeners 
hearing full words rate alternating obstruents as more voiced. 

We expect intraparadigmatic effects to be smaller for fricatives than for plo- 
sives, since in Dutch the voiced-voiceless opposition is weaker for fricatives. 
Many speakers of Dutch tend to realize /z/ as [s], and even more speakers tend to 
realize /v/ as [f] in all positions in the word (e.g., Collins & Mees 1981:159; Gus- 
senhoven & Bremmer 1983:57). Furthermore, the voicing of fricatives is highly 
predictable after vowels, as within words voiced fricatives are nearly always pre- 
ceded by long vowels, and voiceless fricatives by short vowels. Finally, the dif- 
ference between alternating and non-alternating obstruents is represented in or- 
thography for plosives only. Final plosives that alternate in voice are always rep- 
resented as voiced, and non-alternating plosives are always represented as voice- 
less. To give an example, the orthographic representations mand and krant, which 
are both realized with [t], show that the former morpheme is realized with [d] in 
the plural, while the latter morpheme is always realized with [t]. Fricatives, in
contrast, are invariably represented as voiceless at the end of syllables. Their
orthographic representations reveal nothing of their alternating/non-alternating
character. Thus both [bɑs] ‘bass’ and [bas] ‘boss’ are orthographically transcribed
with s (bas, baas), although the plural of [bas] is [bazən] bazen with a voiced [z].
In other words, whereas orthography reinforces the voiced-voiceless opposition 
for plosives, it does not do so for fricatives. 

The participants rated the voicing of the final obstruents on a five point scale. 
In order to ensure that this five point scale would correspond with the full range of 
completely voiceless to completely voiced, we included realizations with com- 
pletely voiced final obstruents in the experiment, which are unnatural in Dutch. 
Thus, the experiment also included realizations such as [pard] (from [part] 
‘horse’) and [kaz] (from [kas] ‘cheese’). 

The materials and the recording procedure are described in section 2. This sec- 
tion also reports acoustic analyses that we carried out in order to ascertain whether 
our speaker had realized alternating obstruents as weakly voiced, and in order to 
document the acoustic characteristics of the voiced final obstruents. In section 3, 
we present the actual rating experiment. Section 4 summarizes the findings and 
presents our conclusions. Moreover, it discusses how to incorporate our findings 
in the grammar of Dutch. 


2. Materials 

We selected 94 monosyllabic Dutch nouns, listed in the Appendix. Of these 
words, 30 end in alternating (e.g., mand), and 29 in non-alternating (e.g., krant) 
bilabial or alveolar plosives. The other 35 words end in labiodental or alveolar 
fricatives, of which 17 alternate in voice (e.g., slaaf ‘slave’ with the plural 
[slavən] slaven), and 18 are always voiceless (e.g., bes ‘berry’ with the plural
[bɛsən] bessen).

We created a list containing two orthographic representations for each word. 
The word was spelled with a voiced final obstruent in one representation and with 
a voiceless obstruent in the other representation. Which representation is correct 
according to the spelling conventions of Dutch depends on the manner of articula- 
tion of the obstruent (plosive, fricative) and its realization in the word’s morpho- 
logical paradigm. The two versions of each word were presented right after each 
other, and the list thus started as follows: baard, baart, baas, baaz, band, bant, ... . 
We asked a male speaker of Dutch, who makes a clear distinction between all 
voiced obstruents in Dutch and their voiceless counterparts, to record the words in 
the list. He was instructed to realize final obstruents as voiced, when they were 
represented as voiced, and to realize them as voiceless, when they were repre- 
sented as voiceless. Although our speaker was not a phonetician, he did not need 
explicit instruction on how to realize word-final obstruents as voiced. Apparently, 
speakers of Dutch have ideas about how to realize word-final obstruents as 
voiced, even though voiced final obstruents do not occur in their language. The 
words were recorded on a DAT (BASF master 94) in a soundproof room by 
means of a portable DAT-recorder Aiwa HD S100 and a Sony microphone ECM 
MS957. The recordings were stored as .wav files (sample rate: 48 kHz) on a
computer by means of the speech analysis package Praat (Boersma 1996). Two
phoneticians checked whether the final obstruents were realized as intended 
(voiced versus voiceless). If not, we asked our speaker to realize these words 
anew. In addition, we also asked our speaker to re-record words that he had real- 
ized with a schwa after the final obstruent. 

We then carried out acoustic measurements in order to investigate whether our 
speaker had realized the final obstruents with incomplete neutralization, and how 
the voiced final obstruents differed from the voiceless ones. We first measured the 
durations of the vowels. We defined the beginning of the vowel as the beginning 
of a regular pattern in the wave form with the characteristics of the vowel, and the 
end of the vowel as the (sudden) end of this regular pattern. In addition, we meas- 
ured the durations of the closures and release noises (the bursts plus the following 
periods of aspiration) of the final plosives, and the total durations of the final 
fricatives. We took the closure of the plosive to end at the sudden increase in am- 
plitude at the beginning of the burst, and we assumed that the fricative and the re- 
lease noise of the plosive end where the amplitude of the wave form is nearly 
identical to that of the background noise. Finally, we determined the proportion of 
the obstruent that was realized with vocal fold vibration. We assumed vocal fold 
vibration to be present if the waveform was periodic, the spectrogram contained a 
voice bar, and we could hear the vocal fold vibration in the acoustic signal. 

The durations of vowels and final consonants in a word are, among other things,
affected by the presence and quality of extra consonants in the coda (Waals 1999).
Since our data set contains only few words with complex codas, and since these 
words differ in the quality of the extra consonant, we restricted all our acoustic 
analyses to the words with simplex codas. We investigated by means of analyses 
of variance whether the durations were affected by the phonological length of the 
vowel (long versus short), the manner of the obstruent (plosive, fricative), the ac- 
tual realization of this obstruent (voiced or voiceless), and its voice alternation 
(alternating, non-alternating). 
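
The analyses of variance used here involve several factors at once. As a much simplified illustration only, the Python sketch below computes a one-way F ratio for a single factor, to make explicit what an F value and its two degrees of freedom express. The function name and the duration values are invented for the example; this is not the model that was actually fitted.

```python
def one_way_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA over lists of numbers."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


# Invented vowel durations (ms) for phonologically long and short vowels.
long_vowels = [200, 210, 190]
short_vowels = [100, 110, 90]
f_value, df1, df2 = one_way_f([long_vowels, short_vowels])
print(f"F({df1},{df2}) = {f_value:.2f}")
```

A large F means that the differences between the group means are large relative to the variation within the groups.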

For the duration of the vowel, we found significant main effects for phono- 
logical length (F(1,101) = 654.09; p < 0.001) and for the manner of the following 
final obstruent (F(1,101) = 339.78; p < 0.001). The phonologically long vowels 
were on average 99.2 ms longer than the phonologically short vowels, and the 
vowels preceding fricatives were on average 62.6 ms longer than the vowels pre- 
ceding plosives. Furthermore, we observed an interaction between the phonologi- 
cal length of the vowel and the manner of the obstruent (F(1,101) = 6.32; 
p = 0.014). The difference between phonologically long and short vowels was less
pronounced before plosives (81.5 ms) than before fricatives (101.9 ms). Finally, 
we observed an interaction between the manner of the obstruent and its actual re- 
alization (F(2,101) = 4.30; p = 0.017). Vowels preceding voiced fricatives were 
on average 16.0 ms longer than vowels preceding voiceless fricatives (see Figure 
1, upper panel), whereas actual realization did not affect the duration of vowels 
preceding plosives (actual realization for plosives: p > 0.1). In many languages, 
voiced obstruents are preceded by longer vowels than voiceless obstruents. This is 
also the case for intervocalic obstruents in Dutch (Slis & Cohen 1969). In contrast,
if a speaker of Dutch realizes word-final obstruents as voiced, apparently
this does not necessarily affect the length of the preceding vowel. 

The length of the final obstruent was affected by the manner of the obstruent 
(F(1,109) = 108.35; p < 0.001) and by its actual realization (F(1,109) = 95.28; 
p < 0.001). Fricatives were on average 52.2 ms longer than plosives, and voiced
obstruents were on average 48.2 ms shorter than voiceless obstruents (see Figure
1, middle panel). The length of a final obstruent may be a perceptual cue to its
voicing.

Figure 1: The average duration of the vowel (upper panel), the average duration of the final
obstruent (middle panel), and the average proportion of the final obstruent realized
with vocal fold vibration (lower panel) for the words realized with voiced (+voice) and
voiceless (−voice) final obstruents, broken down by the manner of articulation of the final
obstruent (plosive, fricative)


We analysed separately the duration of the closure and the duration of the re- 
lease noise for plosives. Closure duration was affected by the phonological length 
of the preceding vowel (F(1,60) = 33.46; p < 0.001), by the place of articulation 
(bilabial or alveolar) of the plosive (F(1,60) = 19.60; p < 0.001), and its actual
realization (F(1,60) = 159.95; p < 0.001). Closures following phonologically long
vowels (75.1 ms) were on average shorter than closures following short vowels 
(94.9 ms), and alveolar closures were on average shorter (82.1 ms) than bilabial
closures (104.4 ms). Interestingly, plosives that were actually voiced had longer
closure durations (on average 108.9 ms) than plosives that were realized as voice- 
less (69.0 ms). This finding is in contrast with data for intervocalic positions, in 
which voiced plosives have shorter closures than voiceless plosives (e.g., Slis & 
Cohen 1969, Ernestus 2000). Probably, our speaker lengthened the voiced clo- 
sures in order to have the presence of vocal fold vibration come out well. Vocal 
fold vibration is an important cue to voicing (see below), but preceding vowels 
mask the presence of vocal fold vibration in directly following closures, since 
they are relatively very loud. By lengthening the closures, our speaker made the 
presence of vocal fold vibration clearly audible. We also observed an interaction 
between the actual realization of the plosive and the phonological length of the 
preceding vowel (F(1,60) = 8.94; p = 0.004). Whereas the difference in closure 
duration between voiced and voiceless plosives was on average 33.6 ms after 
short vowels, it was 54.0 ms after long vowels. Finally, we observed an interac- 
tion between the actual realization of the plosive and its place of articulation 
(F(1,60) = 15.45; p < 0.001). The actual voice realization had a greater effect on 
the closures of alveolar plosives (average difference between voiced and voiceless 
alveolars: 49.8 ms) than on the closures of bilabial plosives (average difference: 
16.7 ms). 

The duration of the release noise was affected by the phonological length of 
the preceding vowel (F(1,60) = 4.47; p = 0.039) and by the actual realization of 
the plosive (F(1,60) = 356.30; p < 0.001). Release noises were longer after long 
vowels than after short vowels (on average 97.5 ms and 87.8 ms, respectively), 
and they were longer when the plosive was realized as voiceless than when it was 
voiced (on average 130.7 ms and 50.8 ms, respectively). The effect of the actual 
realization was larger (interaction between actual realization and place of articula- 
tion F(1,60) = 16.58; p < 0.001) for the alveolar plosives (the average release 
noise duration was 137.5 ms for [t]s, and 46.2 ms for [d]s) than for the bilabial 
plosives (average release noise duration for [p]s: 114.9 ms; for [b]s: 61.2 ms). Im- 
portantly, the duration of the release noise was also affected by the voice alterna- 
tion of the plosive (F(1,60) = 5.58; p < 0.001). Release noises were shorter for 
alternating plosives (the average length was 84.7 ms) than for non-alternating plo- 
sives (93.7 ms). Our speaker realized the alternating plosives with weak voicing. 

Finally, we analyzed the proportion of the obstruent that was realized with vo- 
cal fold vibration. The analysis of variance showed that the manner of the obstru- 
ent (F(1,108) = 47.69; p < 0.001) and its actual realization (F(1,108) = 2568.41; 
p < 0.001) affected the relative duration of vocal fold vibration in the obstruent.
On average, 78.9% of the total duration of an actually voiced obstruent, and only 
4.8% of a voiceless obstruent was realized with vocal fold vibration (see Figure 1, 
lower panel). This shows that our speaker realized final obstruents as voiced by 
keeping his vocal folds vibrating during a larger part of the obstruent. The differ- 
ence between voiced and voiceless realizations was more pronounced for frica- 
tives than plosives (interaction between actual realization and manner: F(1,108) = 
71.41; p < 0.001). On average, both voiceless fricatives and voiceless plosives 
were realized with vocal fold vibration during less than 6.0% of their total duration
(on average 3.5% and 5.8%, respectively), but voiced fricatives were realized
with vocal fold vibration during 92.3%, and voiced plosives during 69.6% of their 
total duration. 

In summary, our speaker signalled actual voicing especially by the presence of 
vocal fold vibration during a large part of the obstruent. In order to make the pres- 
ence of vocal fold vibration well audible in plosives, he lengthened their closures. 
In addition, our speaker shortened voiced obstruents, and lengthened vowels
preceding voiced fricatives (see also Figure 1). We found an effect of voice
alternation only on the duration of the release noises, which suggests that our
speaker realized only the alternating plosives with weak voicing.


3. The rating experiment 
3.1 Method 

For the rating experiment, we divided the word tokens over two master lists. 
Each master list contained only one realization of each word type, that is, it con- 
tained either the realization with the voiced final obstruent (e.g., [krand]) or the 
realization with the voiceless final obstruent ([krant]). Furthermore, each master 
list contained both voiced and voiceless final obstruents, and both alternating and 
non-alternating obstruents. The words ending in the same type of obstruent (al- 
veolar plosive, bilabial plosive, alveolar fricative, and labiodental fricative) were 
blocked in order to keep the options (e.g., [p] or [b] or something in between) in a 
sequence of trials constant. This facilitates the participants’ task, which was diffi- 
cult since the participants were not used to judging speech sounds. Moreover, pre- 
senting the words in blocks might enhance the probability that participants would 
differentiate between the different tokens of obstruents of the same type. We cre- 
ated ten versions of each master list by randomizing the words in the blocks four 
times, while varying the order of the blocks. The resulting twenty lists are the lists 
with full words. 
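
The construction of the presentation lists can be sketched in Python as follows. This is an illustration under stated assumptions rather than the script that was used: the function name, the dictionary layout, the seed, and the small example master list are hypothetical, and the number of versions is left as a parameter.

```python
import random


def make_versions(master_list, n_versions, seed=0):
    """Build presentation lists from one master list.

    master_list maps a block name (type of final obstruent) to its stimuli.
    In every version the stimuli stay grouped by block, the order within
    each block is shuffled, and the order of the blocks is shuffled as well.
    """
    rng = random.Random(seed)
    versions = []
    for _ in range(n_versions):
        block_order = list(master_list)
        rng.shuffle(block_order)
        version = []
        for block in block_order:
            items = list(master_list[block])
            rng.shuffle(items)
            version.extend((block, item) for item in items)
        versions.append(version)
    return versions


# Hypothetical fragment of a master list, using words mentioned in the text.
master = {
    "alveolar plosive": ["mand", "krant", "hond", "baard"],
    "alveolar fricative": ["baas", "bes"],
    "labiodental fricative": ["slaaf", "slurf"],
}
for version in make_versions(master, n_versions=2):
    print([item for _, item in version])
```

Shuffling within blocks rather than across the whole list keeps the response options constant over a run of trials, which is the property the blocking is meant to guarantee.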

One group of participants listened to the full words, while another group lis- 
tened to the rimes of the words starting at the steady states of the vowels. Thus, 
participants heard either [mant], [hond], etc., or [ant], [ond], etc. Since the rimes 
started at the steady state of the vowels, they contained no clear cues to the initial 
consonants, making it nearly impossible to trace the original words from which 
they had been spliced. Some of the initial consonants, however, might have been 
identifiable for some participants. In the worst case, this may have diminished the 
difference in scores between the participants hearing the full words and the par- 
ticipants hearing only the final rimes, which is the main interest of this study. 

Vowels were shorter in the rimes, but since they all started at the steady states, 
the relative difference in duration between vowels preceding alternating and non- 
alternating obstruents was approximately the same in the words as in the rimes, as 
was also shown by a Linear Mixed Effect (LME) model (Pinheiro & Bates 2000, 
Baayen, Tweedie & Schreuder 2002). This LME analysis had the duration of the 
vowel as its dependent variable, type of presentation (word, rime), phonological 
vowel length (long, short), the manner of the final obstruent (plosive, fricative), its 
voice alternation (alternating, non-alternating), and its actual realization (voiced,
voiceless) as independent variables, and word type as random effect variable. It
revealed a main effect for the type of presentation (F(1,165) = 16.34; p < 0.001). 
Unsurprisingly, vowels were longer in the words than in the rimes. In addition, the 
model revealed main effects for the phonological length of the vowel (F(1,165) = 
626.53; p < 0.001), the manner of the final obstruent (F(1,165) = 305.99; 
p < 0.001), and the actual realization of the final obstruent (F(1,165) = 8.31;
p = 0.005). These variables affected vowel duration as described in section 2.
Finally, the actual realization of the obstruent interacted with the manner of the
obstruent (F(1,165) = 36.87; p < 0.001), also as described in section 2. Importantly, the
type of presentation did not interact with voice alternation. We may therefore as- 
sume that the acoustic cues for incomplete neutralization, including release noise 
duration and cues that we did not discover, are roughly similar for the obstruents 
in both words and rimes. 

Some forty-six percent of the rimes (43 rimes) represented existing words of 
Dutch by themselves. For instance, the rime of [klet] kleed ‘carpet’ with the plural
[kledən] represents the existing verb [et] eet ‘eat’, which has the plural eten [etən].
The voice alternation for these words is independent of the voice alternation for 
the original words. We may therefore still expect that, if lexical intraparadigmatic 
information affects perception, the ratings for the full words will be more in con- 
formity with the realizations of the obstruents in the words’ paradigms than the 
ratings for the rimes. 

The participants were tested individually, sitting in a dimly lit room in front of 
a PC monitor and a panel with two buttons. They were asked to listen to the words 
that would be presented to them, and to rate the final obstruents, depending on the 
block, as [b]s or [p]s, as [d]s or [t]s, as [v]s or [f]s, or as [z]s or [s]s, or as some- 
thing in between. Participants received for each block a response form with a five- 
point scale for every trial. The grapheme for the voiceless variant of the obstruent 
illustrated the left-most position of the scale, while the grapheme for the voiced
variant illustrated the right-most position. In (1), we present a line from the
response form for the rating of alveolar plosives.

The course of a trial was as follows. The participant heard a warning beep of 377
Hz for 500 ms, followed by a pause of 200 ms. The participant then heard the 
stimulus, and rated the voicing of the final obstruent on the five point scale. The 
experiment was self-paced. Participants were presented with a new word or rime 
only after they had indicated that they were ready by pushing the right button. 
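
The timing of the start of a trial can be made concrete with a short Python sketch that generates the warning beep and the following pause as a sequence of samples. The beep frequency and the two durations are those given above; the function name, the playback sample rate, and the amplitude are assumptions made for the illustration.

```python
import math


def trial_prefix(sample_rate=48000, beep_hz=377.0, beep_ms=500, pause_ms=200,
                 amplitude=0.5):
    """Return samples (floats in [-1, 1]) for the warning beep plus the pause."""
    n_beep = sample_rate * beep_ms // 1000
    n_pause = sample_rate * pause_ms // 1000
    beep = [amplitude * math.sin(2 * math.pi * beep_hz * n / sample_rate)
            for n in range(n_beep)]
    silence = [0.0] * n_pause
    return beep + silence


samples = trial_prefix()
# The stimulus would start after this many samples.
print(len(samples))
```

At the assumed rate of 48,000 samples per second, the beep and the pause together take 700 ms before the stimulus begins.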

Forty undergraduate students of the Radboud University Nijmegen were paid 
to participate in the experiment. Twenty students listened to the full words, while 
the other twenty students listened to the rimes. The students were all native speak- 
ers of Dutch, and did not report any hearing deficits. Most of them originated 
from the southern part of the Netherlands, and may therefore be assumed to dis- 
tinguish between the voiced and voiceless variants of both plosives and fricatives 
(Collins & Mees 1981:159, Gussenhoven & Bremmer 1983:57).


3.2 Results and discussion

Figure 2 presents the average scores by the participants who heard the final 
rimes (upper panel) and the participants who listened to the full words (lower 
panel). The left panels give the scores for the plosives, while the right panels list 
the scores for the fricatives. The scores are broken down for the realization of the 
final obstruent as intended by the speaker and the alternating/non-alternating 
character of the obstruent. A score of 1 indicates that the final obstruent was per- 
ceived as completely voiceless. A score of 5 indicates that the obstruent was com- 
pletely voiced. 


[Figure panels: Rimes: Plosives; Rimes: Fricatives; Complete words: Plosives; Complete words: Fricatives. Horizontal axis in each panel: −v/−alt, −v/+alt, +v/−alt, +v/+alt.]


Figure 2: The average scores for the final obstruents by the participants who heard only the rimes
(upper panels) and the participants who heard the words (lower panels). The left panels show the
scores for the plosives, while the right panels show the scores for the fricatives. The scores are
broken down by the actual realization of the final obstruent as voiced or voiceless (+v or −v) and its
voice alternation (+alt or −alt)


By means of a stepwise analysis of variance, we investigated whether the average
scores for the items were affected by the type of presentation (rime, word),
the actual realization of the final obstruent (voiced, voiceless), its voice alternation 
(alternating, non-alternating), and its manner of articulation (plosive, fricative). 
We removed four outlier stimuli (hond and dood both realized with [t], schub real- 
ized with [b], and slurf realized with [v]) from the data set in order to improve the 
normality of the model’s residuals. The results are presented in Table 1.

Effects                                          DF    F-value   p-value
Manner                                            1      60.90   < 0.001
Alternation                                       1      87.58   < 0.001
Realization                                       1   12050.30   < 0.001
Manner: Alternation                               1      40.02   < 0.001
Manner: Realization                               1      64.12   < 0.001
Alternation: Realization                          1      52.30   < 0.001
Alternation: Presentation                         2      16.04   < 0.001
Realization: Presentation                         1      23.03   < 0.001
Manner: Alternation: Realization                  1      16.08   < 0.001
Manner: Alternation: Presentation                 2      14.40   < 0.001
Manner: Realization: Presentation                 1      14.42   < 0.001
Alternation: Realization: Presentation            1       8.49     0.004
Manner: Alternation: Realization: Presentation    1       6.63     0.010


Table 1: Results of the stepwise analysis of variance of the average scores 


Figure 2 clearly shows that, unsurprisingly, by far the most important predictor 
for the average scores is the actual realization of the final obstruent. Whereas 
voiced obstruents were assigned an average score of 4.69, voiceless obstruents 
received an average score of only 1.54. This shows that listeners are sensitive to 
the acoustic cues of vowel duration, obstruent duration, and relative duration of 
vocal fold vibration for voicing. 

Voice alternation is also an important factor (see Table 1). Scores were on average
0.28 higher for alternating obstruents than for non-alternating obstruents,
but, as is clear from Figure 2, the effect is mainly carried by the voiceless plosives 
(as is supported by the interactions between voice alternation and manner of ar- 
ticulation, between voice alternation and actual realization, and between voice al- 
ternation, manner of articulation, and actual realization, as reported in Table 1). 
Whereas the average effect of voice alternation did not exceed 0.10 for the frica- 
tives and the voiced plosives, it was 0.70 for the voiceless plosives. The focus of 
this study is on the interaction between voice alternation and type of presentation. 
Figure 2 suggests that an interaction is present for the voiceless plosives and the 
voiced plosives, but that it is absent for the fricatives. This is supported by sepa- 
rate analyses of variance on these four types of obstruents (see Table 2), as well as 
by the interactions between voice alternation and type of presentation with man- 
ner of articulation and actual realization in the overall analysis (Table 1). Voice 
alternation affects the rating for the voiceless plosives both in the rime condition 
(F(1,55) = 6.99; p = 0.011) and in the full word condition (F(1,56) = 151.11; 
p < 0.001), but the effect in the full word condition is larger (on average 1.11 in
the full word condition versus 0.27 in the rime condition). The interaction be- 
tween voice alternation and type of presentation is weaker for the voiced plosives 
than for the voiceless plosives (an analysis of just the plosives shows an interac- 
tion between voice alternation, type of presentation, and actual realization: 
F(1,225) = 12.12; p < 0.001), but again voice alternation has a larger effect on the
full words than on the rimes (a non-significant average difference of −0.04 for the
rimes and a significant difference of 0.24 for the full words). Since the words and 
the rimes do not differ in acoustic incomplete neutralization, the interaction of 
voice alternation with type of presentation must be due to lexical information that 
listeners access upon hearing words. The difference between words and rimes 
therefore shows that, at least for the final plosives in words, the effect of voice 
alternation is not only mediated by incomplete neutralization in the acoustic sig- 
nal. 


Type of obstruent       F-value   p-value
Voiceless plosives        38.04   < 0.001
Voiced plosives           10.00     0.002
Voiceless fricatives       0.70   > 0.1
Voiced fricatives          0.33   > 0.1


Table 2: The interaction between voice alternation and type of 
presentation in separate analyses of variance of the 
average scores for the four types of final obstruents 


The interaction between voice alternation and type of presentation may be 
smaller for the voiced plosives than for the voiceless plosives, because the voiced 
realizations were unnatural for the listeners, who consequently scored nearly all 
voiced realizations as completely voiced. In addition, we may be observing a ceil- 
ing effect, because the voiced obstruents were already maximally voiced. The ab- 
sence of an interaction for fricatives suggests that their voiced-voiceless opposi- 
tion is too weak for our listeners, that is, there is not sufficient intraparadigmatic 
information for listeners to rely on for fricative final words. 

The effects of the actual realization on the rating scores support the hypothesis 
that the ratings for the plosives in the full word condition were affected by lexical 
information. The obstruent’s realization is expected to have a larger effect on the 
rating scores when listeners rely more on the acoustic signal. We find that while 
the difference between voiced and voiceless fricatives is approximately the same 
in the rime and in the full word condition, the difference in scores between voiced 
and voiceless plosives is smaller for words than for rimes (2.76 versus 3.20; see 
the interaction between actual realization, type of presentation, and manner of ar- 
ticulation given in Table 1). Apparently, listeners based their ratings less on the 
acoustic signal when they could identify the word (full word condition) and the 
obstruent was a plosive. In these cases, listeners based their ratings also on the 
paradigm of the word. 

Our results hardly change if we restrict our analysis to the words of which the 
rimes are not existing words of Dutch by themselves. Such an analysis shows ex- 
actly the same main effects and interactions, except that the interaction between 
voice alternation, actual realization, and type of presentation is missing, probably 
due to the smaller number of data points. This shows that the effect of voice alter- 
nation on the scores for the rimes cannot be due only to the rimes that represent 
words by themselves, and thus to intraparadigmatic information that listeners
access upon hearing these rimes. The effect of voice alternation on the rimes is
mainly due to incomplete neutralization in the signal. 

In conclusion, alternating voiceless plosives were rated as more voiced than 
non-alternating voiceless plosives, especially when presented in full words. This
finding shows that intraparadigmatic effects on listeners’ voicing ratings are
mediated not only by incomplete neutralization in the acoustic signal, but also by
listeners’ paradigmatic knowledge.


4. General discussion and conclusion 

This study addresses the question of whether intraparadigmatic effects in per- 
ception are mediated only by incomplete neutralization in the acoustic signal. We 
carried out a transcription experiment in which Dutch listeners who were not pho- 
netically trained were presented with either full words or the final rimes of these 
same words, and were asked to rate the final obstruents as voiced or voiceless on a 
five point scale. Half of the words end in obstruents that are generally realized as 
voiceless, while the other half end in obstruents that alternate in voice, that is, are 
realized as voiced in some members of the words’ paradigms. 

The stimuli were recorded by a speaker of Dutch who realized final obstruents 
as voiced mainly by shortening them and realizing them with vocal fold vibration 
during a longer period. He lengthened preceding vowels only for fricatives. Inter- 
estingly, the cues that our speaker used to signal voicing were different from the 
cues that are used by speakers of English, for whom the duration of the preceding 
vowel is a major cue for all types of obstruents (e.g., Denes 1955). This difference 
may help explain the difficulties that speakers of Dutch experience with the voic- 
ing of word-final obstruents in English: Dutch speakers may focus on acoustic 
cues that are less relevant for English. 

Acoustic analyses showed an effect of voice alternation only on the durations 
of the release noises of plosives. This is in line with the study by Ernestus and
Baayen (2006) on pseudowords, in which voice alternation was also found to affect
only release noise duration. Possibly, we found no effect of voice alternation
on vowel duration, in contrast to Warner et al. (2004), because the words with al- 
ternating and non-alternating obstruents in our experiment differed in their phono- 
logical make-up (i.e., they were not minimal pairs), and differences caused by the 
surrounding environment obscured any systematic vowel duration difference. 

The main predictor for the average voicing scores by the listeners was the 
voice realization as intended by the speaker. In addition, scores were higher for 
alternating plosives than for non-alternating plosives. For the voiceless plosives, 
this was the case both in the rime and in the full word condition, while for the 
voiced plosives it was only so in the full word condition. Participants listening to 
rimes could not readily identify the presented words in most cases, and their rating
must therefore have been based mainly on the acoustic signal. We think that the 
intraparadigmatic effects on their scores are predominantly the result of incom- 
plete neutralization in the acoustic signal, and therefore made possible by in- 
traparadigmatic effects in production. In contrast, participants listening to full 
words could identify the presented words in all cases, and their scores could
therefore be strongly affected by lexical knowledge as well. Indeed, the
intraparadigmatic effects observed for the voiceless plosives were greater in the full word
condition than in the rime condition. This shows that the knowledge of a word’s 
paradigm also affects the interpretation of voicing. We conclude that intrapara- 
digmatic effects in perception are partly mediated by the acoustic signal, and 
partly induced by the listener’s lexical knowledge. 

Note that our transcribers’ intraparadigmatic knowledge in fact prevented 
them from basing their scores completely on the acoustic signal. Clearly, at least 
for untrained transcribers, intraparadigmatic knowledge gets in the way of phonetic 
transcriptions, which are intended as objective representations of the acoustic signal. 
Thus intraparadigmatic knowledge presents another problem for objective phonetic 
transcriptions, which Vieregge (1987) already claimed to be impossible. Furthermore, 
this finding shows that intraparadigmatic effects mediated by listeners’ 
lexical knowledge are automatic: they arise even when this is counterproductive 
for the task that listeners have to carry out. 

The intraparadigmatic effects in perception mediated by the listeners’ knowledge 
may well be the motor behind incomplete neutralization in the acoustic signal 
in production. The alternating and non-alternating obstruents that were voiceless 
differed in their scores by only 0.21 in the rime condition but by 0.73 in the full 
word condition. In other words, listeners’ knowledge of the morphological paradigms 
gives rise to an effect in the ratings that may be more than twice as big 
(0.73 − 0.21 = 0.52 versus 0.21) as that caused just by incomplete neutralization in 
the acoustic signal. The differences that listeners perceive between alternating and 
non-alternating obstruents are magnified by their knowledge of the paradigms, and 
this may encourage them to maintain these differences in their speech. 
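
The comparison between the two conditions can be spelled out as a short calculation. The sketch below uses only the two score differences reported above; the variable names are ours, and the step of treating the full word difference as the sum of an acoustic part and a lexical part is a simplifying assumption made for illustration, not a result of the statistical analysis.

```python
# Score differences between alternating and non-alternating voiceless
# obstruents, as reported in the text.
rime_diff = 0.21       # rime condition: ratings based mainly on the signal
full_word_diff = 0.73  # full word condition: signal plus lexical knowledge

# Illustrative assumption: the full word difference is the sum of an
# acoustic contribution (the rime difference) and a lexical contribution.
lexical_part = full_word_diff - rime_diff
ratio = lexical_part / rime_diff

print(f"lexical contribution: {lexical_part:.2f}")
print(f"lexical / acoustic ratio: {ratio:.2f}")
```

Under this additive reading, the part of the difference attributable to lexical knowledge is roughly two and a half times the part attributable to the acoustic signal alone.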

The intraparadigmatic effects in perception reported here in fact instantiate a subtle 
form of paradigmatic levelling. Steriade (2000) claims that paradigmatic levelling 
may affect phonemes as well as subphonemic characteristics. What we have 
shown here is that subphonemic paradigmatic levelling is not restricted to production, 
but is also a characteristic of perception. 

Only the scores for final plosives were clearly affected by the words’ paradigms. 
Recall that we found no evidence for incomplete neutralization in the signal 
for fricative-final words. This may explain why the alternating character of 
final fricatives did not affect the scores of the participants who heard just the final 
rimes. The fact that the participants listening to the full words also showed no effects, 
even though they could rely on their intraparadigmatic knowledge, may be 
explained by the weakness of the voiced-voiceless opposition for fricatives in 
Dutch. This opposition is not well maintained by most speakers of Dutch, it is 
hardly distinctive after vowels, and it is not supported by the spelling conventions 
of Dutch in syllable-final position. 

Our listeners scored as slightly voiced those voiceless obstruents that are 
spelled as voiced. Nevertheless, the intraparadigmatic effects reported here cannot 
be due to orthography alone, since our participants also showed sensitivity to 
incomplete neutralization in the signal when listening to rimes that do not represent 
words by themselves, and for which they consequently could not rely on the 
spelling. Moreover, other studies also show that intraparadigmatic effects on the 
production and comprehension of voicing can be independent of orthography. As 
already mentioned in section 1, Dinnsen & Charles-Luce (1984) and Charles-Luce 
(1993) have shown that incomplete neutralization is also present in Catalan minimal 
word pairs for which the spelling does not reflect the difference between 
alternating and non-alternating obstruents. Furthermore, we have shown in a previous 
study that Dutch listeners are sensitive to weak voicing in fricatives, even 
though voiceless fricatives are always spelled as voiceless (Ernestus & Baayen 
2006). Finally, we also have evidence that intraparadigmatic effects that are not 
mediated by incomplete neutralization do not just result from orthography. In a 
follow-up study (Ernestus & Baayen 2007), we presented the stimuli from the present 
rating experiment in a lexical decision experiment. We found that the frequency 
with which an alternating plosive is realized as voiced relative to the frequency 
with which it is voiceless affects response latencies. This frequency effect 
cannot be due to orthography, since both the voiced and voiceless realizations of 
alternating plosives are spelled as voiced. Intraparadigmatic effects need not be 
supported by orthography. 

We now turn to the question of how to incorporate our findings in the gram- 
mar of Dutch. In generative grammar, intraparadigmatic effects are traditionally 
accounted for by means of underlying forms. A morpheme-final obstruent that is 
voiced before vowel-initial suffixes is also voiced in the underlying form, as al- 
ready mentioned in the introduction of this paper. Thus, the underlying form of 
[mant] is /mand/ because of the voiced [d] in the plural [mandən]. Incomplete 
neutralization can be accounted for by the assumption that the voicing of the final 
obstruent in the underlying form affects the production of the obstruent via pho- 
netic implementation rules preceding or coinciding with Final Devoicing, or re- 
placing Final Devoicing (Dinnsen & Charles-Luce 1984, Port & O’Dell 1985, 
Slowiaczek & Dinnsen 1985). 

A generative account of the data with a rule (or constraint) of Final Devoicing 
supplemented with various independent phonetic implementation rules (or con- 
straints) is possible, but cumbersome. What is required to account for the data is 
(1) information as to whether a word-final (voiceless) obstruent is alternating, (2) 
Final Devoicing, and (3) phonetic implementation rules (constraints) that weakly 
(re)voice voiceless obstruents. By Occam’s razor, we prefer a theory in which 
phonetic realization rules directly produce the correct form from the alternation 
information in the lexicon to a theory that devoices underlyingly voiced segments 
while partly re-voicing them at another stage in the derivation. In the OT 
framework, we prefer a theory in which the constraint of Final Devoicing is 
simply dispensed with. 

This line of argument can be taken a step further, since the mental lexicon con- 
tains form representations for a great many words, including inflected forms (e.g., 
Jackendoff 1975, Baayen, Dijkstra & Schreuder 1997, Alegre & Gordon 1999, 
Baayen, McQueen, Dijkstra & Schreuder 2003). Given that nearly every word is 
lexically represented, we may assume lexical representations that directly reflect 
the words’ pronunciations. That is, the singular mand may be represented as 
/mant/ and the plural manden as /mandən/. As a consequence, all theoretical approaches 
based on lexical representations that do not reflect actual pronunciation 
are unnecessarily complex, irrespective of whether they assume Final Devoicing. 
Hence, by Occam’s razor, we also disprefer accounts in which phonetic imple- 
mentation rules are directly applied to underlying voiced obstruents or to archi- 
phonemes unspecified for voice (cf. Trubetzkoy 1958, Lieb 1998), even though 
these accounts can easily incorporate incomplete neutralization. 

Following Bybee (2001; see also Ernestus & Baayen 2006), we view intraparadigmatic 
effects in production and perception as resulting from lexical 
analogy that is mediated neither by abstract underlying representations nor by an 
idiosyncratic feature that marks a final obstruent as alternating or non-alternating. 
When speakers realize [mant] in production, they activate [mant] as well as 
[mandən]. The plural contains [d], which may affect the realization of the word-final 
[t], resulting in weak voicing. In contrast, when speakers realize [krant], the 
inflectional paradigm members do not contain [d], and speakers do not slightly voice 
the word-final obstruent. 
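
The contrast between the two example words can be made concrete with a small toy sketch. It only checks whether some paradigm member has a voiced obstruent where the base form has its final voiceless obstruent; it does not model activation, analogy, or degrees of voicing. The function name, the voiced-to-voiceless mapping, and the simplified transcriptions (including the plural form given for the second word) are illustrative assumptions, not material from the experiments.

```python
# Toy lookup: voiced obstruents and their voiceless counterparts
# (a deliberately small, hypothetical subset).
VOICED_TO_VOICELESS = {"b": "p", "d": "t", "v": "f", "z": "s"}


def is_alternating(base, paradigm):
    """Return True if a paradigm member has a voiced obstruent in the
    position where the base form has its final voiceless obstruent."""
    final = base[-1]
    stem_len = len(base) - 1
    for form in paradigm:
        if form == base or len(form) <= stem_len:
            continue
        same_stem = form[:stem_len] == base[:stem_len]
        if same_stem and VOICED_TO_VOICELESS.get(form[stem_len]) == final:
            return True
    return False


# Hypothetical mini-lexicon in simplified transcription.
paradigms = {
    "mant": ["mant", "mandən"],
    "krant": ["krant", "krantən"],
}

for base, forms in paradigms.items():
    print(base, is_alternating(base, forms))
```

In an analogical account, a form for which this check succeeds is one whose paradigm members could pull the realization of the final obstruent towards weak voicing.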

Lexical analogy can also explain intraparadigmatic effects in perception. 
When listeners perceive a word, this word as well as the morphologically related 
words are activated in the listeners’ mental lexicon, and these paradigmatic com- 
petitors codetermine the listeners’ percept. Formalized analogical models, such as 
Skousen’s Analogical Model of Language (Skousen 1989, 1993), can easily in- 
corporate these intraparadigmatic effects. 

One of the predictions of our approach, which we leave for further research, is 
that alternating intervocalic voiced obstruents would be realized and perceived as 
less voiced than their non-alternating counterparts. For instance, the alternating 
[d] of Dutch [bidən] bidden ‘to pray’ (with the singular present-tense [bit] bid) 
would be produced and perceived as less voiced than the non-alternating [d] of 
[midən] midden ‘middle’. Note that under a generative account with abstract underlying 
representations, we would not expect a difference between [bidən] and 
[midən], since both forms underlyingly contain /d/ (/bidən/, /midən/). Thus, alternating 
and non-alternating voiced intervocalic obstruents would form a good test 
case for deciding between the generative account and the lexical paradigmatic account. 

To conclude, the main point of this study is that the intraparadigmatic realization 
of an obstruent affects not only its production but also its perception. The 
intraparadigmatic effects in perception are partly mediated by incomplete neu- 
tralization in the acoustic signal, and partly arise due to listeners’ lexical knowl- 
edge. These intraparadigmatic effects are automatic, and prevent listeners from 
producing accurate phonetic transcriptions that are true reflections of the acoustic 
signal itself. 


Acknowledgements 
We would like to thank Louis Pols, Erik Jan van der Torre, Jeroen van de Weijer, and an anony- 
mous reviewer for their helpful comments on an earlier version of this paper. 


INTRAPARADIGMATIC EFFECTS 171 


References 

Alegre, Maria & Peter Gordon. 1999. “Frequency Effects and the Representa- 
tional Status of Regular Inflections”. Journal of Memory and Language 40.41- 
61. 

Baayen, R. Harald, Ton Dijkstra & Rob Schreuder. 1997. “Singulars and Plurals 
in Dutch: Evidence for a Parallel Dual Route Model”. Journal of Memory and 
Language 37.94-117. 

Baayen, R. Harald, James McQueen, Ton Dijkstra & Rob Schreuder. 2003. “Fre- 
quency Effects in Regular Inflectional Morphology: Revisiting Dutch Plurals”. 
Morphological Structure in Language Processing ed. by R. Harald Baayen & 
Rob Schreuder, 355-390. Berlin: Mouton de Gruyter. 

Baayen, R. Harald, Fiona J. Tweedie & Rob Schreuder. 2002. “The Subjects as a 
Simple Random Effect Fallacy: Subject Variability and Morphological Family 
Effects in the Mental Lexicon”. Brain and Language 81.55-65. 

Boersma, Paul. 1996. Praat: Doing Phonetics by Computer. Ms., University of 
Amsterdam. 

Booij, Geert E. 1995. The Phonology of Dutch. Oxford: Clarendon Press. 

Bybee, Joan L. 2001. Phonology and Language Use. Cambridge: Cambridge Uni- 
versity Press. 

Charles-Luce, Jan. 1993. “The Effects of Semantic Context on Voicing Neutrali- 
zation”. Phonetica 50.28-43. 

Collins, Beverley S. & Inger Mees. 1981. The Sounds of English and Dutch. Lei- 
den: Leiden University Press. 

Cucchiarini, Catia. 1993. Phonetic Transcription: A Methodological and Empiri- 
cal Study. Ph.D. dissertation, University of Nijmegen. 

Denes, Paul. 1955. “Effect of Duration on the Perception of Voicing”. Journal of 
the Acoustical Society of America 27.761-764. 

Dinnsen, Daniel A. & Jan Charles-Luce. 1984. “Phonological Neutralization, Pho- 
netic Implementation and Individual Differences”. Journal of Phonetics 12.49- 
60. 

Ernestus, Mirjam. 2000. Voice Assimilation and Segment Reduction in Casual 
Dutch. A Corpus-Based Study of the Phonology-Phonetics Interface. Ph.D. 
dissertation, Vrije Universiteit Amsterdam. 

Ernestus, Mirjam & R. Harald Baayen. 2003. “Predicting the Unpredictable: Interpreting 
Neutralized Segments in Dutch”. Language 79.5-38. 

Ernestus, Mirjam & R. Harald Baayen. 2006. “The Functionality of Incomplete 
Neutralization in Dutch: The Case of Past-Tense Formation”. Laboratory Pho- 
nology 8 ed. by Louis M. Goldstein, D. H. Whalen & Catherine T. Best, 27- 
49. Berlin: Mouton de Gruyter. 

Ernestus, Mirjam & R. Harald Baayen. 2007. “Paradigmatic Effects in Auditory 
Word Recognition: The Case of Alternating Voice in Dutch”. Language and 
Cognitive Processes 22.1-24. 

Fourakis, Marios & Gregory K. Iverson. 1984. “On the ‘Incomplete Neutraliza- 
tion’ of German Final Obstruents”. Phonetica 41.140-149. 


172 MIRJAM ERNESTUS & R. HARALD BAAYEN 


Gussenhoven, Carlos & Rolf H. Bremmer Jr. 1983. “Voiced Fricatives in Dutch: 
Sources and Present-Day Usage”. North-Western European Language Evolu- 
tion 2.55-71. 

Jackendoff, Ray S. 1975. “Morphological and Semantic Regularities in the Lexi- 
con”. Language 51.639-671. 

Jongman, Allard, Joan A. Sereno, Marianne Raaijmakers & Aditi Lahiri. 1992. 
“The Phonological Representation of [Voice] in Speech Perception”. Lan- 
guage and Speech 35.137-152. 

Kemps, Rachel, Mirjam Ernestus, Rob Schreuder & R. Harald Baayen. 2005. 
“Prosodic Cues for Morphological Complexity: The Case of Dutch Noun Plu- 
rals”. Memory and Cognition 33.430-446. 

Lahiri, Aditi, Allard Jongman & Joan A. Sereno. 1990. “The Pronominal Clitic 
d’r in Dutch: A Theoretical and Experimental Approach”. Yearbook of 
Morphology 3.115-127. 

Lieb, Hans-Heinrich. 1998. “Morph, Wort, Silbe: Umrisse einer Integrativen 
Phonologie des Deutschen”. Variation und Stabilität in der Wortstruktur: 
Untersuchungen zu Entwicklung, Erwerb und Varietäten des Deutschen und anderer 
Sprachen ed. by Matthias Butt & Nanna Fuhrhop (= Germanistische Linguistik 
141-142), 334-407. Hildesheim: Georg Olms Verlag. 

Pinheiro, José C. & Douglas M. Bates. 2000. Mixed-Effects Models in S and S- 
PLUS. New York: Springer. 

Port, Robert & Penny Crawford. 1989. “Incomplete Neutralization and Pragmatics 
in German”. Journal of Phonetics 17.257-282. 

Port, Robert & Michael O’Dell. 1985. “Neutralization of Syllable-Final Voicing 
in German”. Journal of Phonetics 13.455-471. 

Skousen, Royal. 1989. Analogical Modeling of Language. Dordrecht: Kluwer. 

Skousen, Royal. 1993. Analogy and Structure. Dordrecht: Kluwer. 

Slis, Iman H. & Antonie Cohen. 1969. “On the Complex Regulating the Voiced- 
Voiceless Distinction”. Language and Speech 12.80-102; 137-155. 

Slowiaczek, Louisa M. & Daniel A. Dinnsen. 1985. “On the Neutralizing Status 
of Polish Word-Final Devoicing”. Journal of Phonetics 13.325-341. 

Steriade, Donca. 2000. “Paradigm Uniformity and the Phonetics-Phonology 
Boundary”. Papers in Laboratory Phonology V: Language Acquisition and the 
Lexicon ed. by Michael Broe & Janet Pierrehumbert, 313-334. Cambridge: 
Cambridge University Press. 

Trubetzkoy, Nikolai S. 1958. Grundzüge der Phonologie. Göttingen: 
Vandenhoeck & Ruprecht. 

Vieregge, Wilhelm. 1987. “Basic Aspects of Phonetic Segmental Transcription”. 
Probleme der Phonetischen Transkription ed. by Antonio Almeida & Ange- 
lika Braun, 5-55. Stuttgart: Franz Steiner. 

Waals, Juliette. 1999. An Experimental View of the Dutch Syllable. The Hague: 
Holland Academic Graphics. 

Warner, Natasha, Erin Good, Allard Jongman & Joan A. Sereno. 2006. 
“Orthographic vs. Morphological Incomplete Neutralization Effects”. Journal of 
Phonetics 34.285-293. 


Warner, Natasha, Allard Jongman, Joan A. Sereno & Rachel Kemps. 2004. “In- 
complete Neutralization and Other Sub-Phonemic Durational Differences in 
Production and Perception: Evidence from Dutch”. Journal of Phonetics 
32.251-276. 

Zonneveld, Wim. 1983. “Lexical and Phonological Properties of Dutch Devoicing 
Assimilation”. Sound Structures: Studies for Antonie Cohen ed. by Marcel P. 
R. van den Broecke, Vincent J. van Heuven & Wim Zonneveld, 297-312. Dor- 
drecht: Foris. 


Appendix 


Experimental words ending in obstruents that are voiced in inflectionally related 
words: 


krib ‘manger’     kleed ‘cloth’     grens ‘border’ 
kwab ‘lobe’       koord ‘cord’      hals ‘neck’ 
rib ‘rib’         maand ‘month’     kaas ‘cheese’ 
schub ‘scale’     mand ‘basket’     laars ‘boot’ 
web ‘web’         moord ‘murder’    muis ‘mouse’ 
baard ‘beard’     naald ‘needle’    neus ‘nose’ 
bed ‘bed’         oord ‘place’      prijs ‘price’ 
brand ‘fire’      paard ‘horse’     spijs ‘food’ 
brood ‘bread’     strand ‘beach’    korf ‘basket’ 
bruid ‘bride’     tand ‘tooth’      scherf ‘fragment’ 
dood ‘dead’       veld ‘field’      slaaf ‘slave’ 
eend ‘duck’       vod ‘rag’         slurf ‘trunk’ 
hand ‘hand’       woord ‘word’      staaf ‘bar’ 
held ‘hero’       zwaard ‘sword’    wolf ‘wolf’ 
hemd ‘shirt’      baas ‘boss’       zalf ‘ointment’ 
hond ‘dog’        gans ‘goose’ 


Experimental words ending in obstruents that are always voiceless: 


klap ‘bang’         lat ‘slat’          fles ‘bottle’ 
mep ‘clout’         lint ‘ribbon’       kous ‘stocking’ 
schep ‘scoop’       maat ‘measure’      pols ‘wrist’ 
stip ‘dot’          mot ‘moth’          tas ‘bag’ 
strip ‘strip’       pet ‘cap’           tros ‘cluster’ 
beurt ‘turn’        pit ‘pip’           vis ‘fish’ 
cent ‘cent’         poort ‘gate’        zeis ‘scythe’ 
fluit ‘flute’       put ‘well’          bef ‘jabot’ 
geit ‘goat’         schat ‘treasure’    juf ‘female teacher’ 
grot ‘cave’         scheut ‘twinge’     nimf ‘nymph’ 
hert ‘deer’         spruit ‘sprout’     plof ‘thud’ 
kat ‘cat’           staart ‘tail’       rif ‘reef’ 
klant ‘customer’    bes ‘berry’         slof ‘slipper’ 
knot ‘knot’         bos ‘woods’         straf ‘punishment’ 
krat ‘crate’        dans ‘dance’ 
kreet ‘cry’         eis ‘requirement’ 
krot ‘slum’ 


Bulgarian 100 


C. 
Catalan 1, 12, 154, 155, 169 
Cushitic 43 


D. 
Danish 100 
Dutch 
Beuningen 82 
Brabantish 89 
eastern varieties 81 
Ghent 85 
Groningen 127, 130 
Noord-Deurningen 82 
northern varieties 127 
Rossum 82 
southern varieties 81, 84, 126, 130, 
144, 163 
Standard Dutch 1, 81, 86, 88, 89, 91, 
127, 132 
Tilligte 84, 86 
Twente 91 
western varieties 127, 130, 144 


E. 

English 12, 14, 15, 27, 30, 32 n.4, 33 n.13, 
33 n.17, 33 n.18, 33 n.20, 34 n.29, 41, 
42, 43, 44, 45, 46, 47, 53, 55, 56, 57, 
58-74, 75 n.1, 76 n.11, 76 n.15, 77, 
90, 100, 103, 105, 106, 107, 108, 110, 
130, 132, 140, 141, 142, 143, 146, 
147, 167 
American English 60, 69, 142 
Canadian English 102 
Standard English 130 
Yorkshire English 32 n.12 


F. 

Flemish 81, 89 

French 33, 44, 100, 147 n.6 
Canadian French 102 
European French 102 
Parisian French 32 n.12 

Frisian 44, 95 n.1, 95 n.6 
West-Frisian 95 n.6 


G. 

Georgian 14 

German 1, 12, 14, 27, 29, 33 n.18, 34 n.29, 
41, 42, 43, 44, 45, 46, 47, 52-54, 55, 
58, 63, 69, 74, 75 n.9, 76, 91, 95 n.10, 
100, 103, 130, 154 
Standard German 130 

Germanic 30, 41, 44, 74, 100 
West-Germanic 81, 88, 141 

Greek, Modern 90 


H. 

Hebrew, Tiberian 34 n.29 

Hindi 42, 76 n.11 

Hungarian 33 n.18, 95 n.10, 147 n.6 


I. 
Icelandic 24, 90 
Igbo 43 


J. 
Japanese 26, 27, 100 


K. 
Kwa 43 


M. 
Maori 14 
Marathi 76 n.11 


N. 
Navajo 28, 33 n.18 


P. 

Pali 90 

Persian 27 

Polish 1, 12, 14, 32 n.12, 33 n.18, 33 n.20, 
33 n.21, 100, 101, 107, 154 


R. 
Russian 1, 33 n.18, 95 n.10, 100 


S. 
Sanskrit 19, 90 
Slavic languages 95 n.10 
Spanish 27, 42, 44, 100 
Latin American Spanish 27 
Swedish 14, 33 n.18 


T. 
Taiwanese 141 
Thai 75 n.4, 90, 99 
Turkish 19, 32 n.12, 33 n.21, 87, 88 


U. 
Uyghur 27 


Y. 
Ya:the 32 n.12 
Yiddish 12, 14, 33 n.18, 33 n.21, 44, 95 n.1, 147 n.6 


Subject Index 


A. 
abduction 45, 140, 143, 144ff 
abstractness 9, 12, 45, 72-74, 75 n.8, 81, 
84, 94, 105, 154, 170 
acoustics 43, 44, 45, 55, 59, 60, 65, 66, 74, 
102-108, 113, 118, 120, 121, 122, 
125, 127, 128, 130, 145, 153-159, 
163, 165, 166, 167, 168, 170 
acquisition 22, 27, 41ff 
CLPF database 47 
development curve 50 
Nijmegen database in Childes 52 
production 42, 46, 63ff 
active voicing 143ff 
allomorphy 32 n.6, 155 
alveolars 50, 51, 53, 61, 66, 69, 102, 104, 
109, 110, 144, 158, 160ff 
analogy 153, 170 
Articulatory Effort Hypothesis 54f, 74 
Articulatory Phonology 140 
aspiration 41ff, 90, 91, 99, 100, 107, 108, 
120, 130, 145, 155, 159 
aspiration languages 41ff 
assimilation: see voicing assimilation 
associative priming experiment 99ff 


B. 

binary vs. unary features 1, 10, 11, 12, 32 
n.12, 33 n.12, 33 n.20, 42ff, 139, 147 
n.6 

burst (of plosives) 103f, 108, 113, 129, 
135, 153, 155, 159 
spectral centre of gravity 103 


C. 
child-directed speech 50ff 
Childes database 47, 52, 53, 60, 75 
clitics 16, 32 n.5, 34 n.25, 95 n.7, 156 
closure (of plosives) 55, 72, 99ff, 120, 
129, 159ff 
clusters 1, 3, 4, 10, 11, 12, 14, 28, 29, 30, 
34 n.27, 34 n.29, 89, 91, 94, 95, 125ff 
coarticulation 127, 140ff 
consonant harmony 55f, 70 
constraints (OT) 
Agree 12ff 
constraint demotion 22 
constraint families 19, 31 
coranking 21, 22, 33 n.23 


domains 13, 27, 33 n.22, 33 n.24 
FinalDevoicing 17, 22, 81ff 
fixed ranking 28 
Ident-OO 24, 25, 26, 31, 34 n.25, 
84ff 
IDLaryngeal 13ff 
IDOnsetLar 13ff 
*Lar 1, 2, 12ff 
Lyman’sLaw 26ff 
MaxLar 30f 
metaconstraints 2, 19, 26, 29, 31 
Multilink 90ff 
*OnsetFric 26ff 
RealizeMorpheme 93f 
stratified hierarchy 22, 23 
Vaux’sLaw 90ff 
contrast, final vs. initial 59, 71 
core vs. periphery in phonology 32 n.11 
coronals 50, 62, 75 n.3, 82, 92, 95 n.3 
cricothyroid 142f 


D. 
de-aspiration 41 
de-aspiration errors 47, 52, 58, 60, 
63, 74 
degemination 3, 4, 34 n.27 
delinking 10ff, 69f, 89f, 96 n.14 
dentals 88, 102 
devoicing 
active devoicing 143ff 
devoicing errors 46f, 50ff, 63ff 
devoicing harmony 57f 
fricative devoicing 1, 2, 4, 5, 10, 12, 
15-17, 23, 26, 33 n.15, 34 n.29, 95 
n.8, 126 
initial devoicing 53, 60ff 
devoicing: see also final devoicing 
diachronic phonology 41, 84, 90 
dialectal variation 72, 81ff, 130, 143 
‘Dimension’ theory 33 n.12, 75 n.1 
diminutive suffix 32 n.6 
discrimination task 99, 100, 115ff 
dissimilation 34 n.26 
‘Duke of York gambit’ 11, 32 n.4 
d-weakening 24 


E. 
ejectives 75 n.1 
Elsewhere condition 32 n.4 
empty vowel position 81, 84, 86, 87, 93ff 
extrametricality 33 n.13, 33 n.17 


F. 

F0 103, 104, 129ff, 138ff 

F1 130, 138f 

faithfulness 73, 76 n.17, 83ff, 92, 95 n.8, 
96 n.15 
positional faithfulness 1, 13, 28ff 

final devoicing 1ff, 58f, 72f, 81ff, 125, 
154, 169f 
exceptions 81ff 

fortition 90 

fricatives and voicing 81, 82, 85ff, 125, 
126, 127, 129, 135ff, 157ff 

fricatives and voicing: see also fricative 
devoicing 

Fusion 11f 


G. 

geminates 31 n.2, 89, 91ff 

glottalization 55, 72, 143 

glottal stop 129, 136, 146 

Glottal Width 75 n.1, 144 

glottis 100, 101, 140 
enlargement of the supraglottal cavity 
100ff 
transglottal pressure 100f, 143f 

GTR database 82, 91 


H. 

/h/ 125, 127, 130ff, 145ff, 147 n.1, 147 n.4 
Harms’s Generalization 14f, 33 n.20 
hypocoristics 27 


I. 

identity priming experiment 106, 110, 
113ff 

implosives 75 n.1 

initial devoicing 53, 60ff 

initial voicing 54, 58, 61, 66ff 

input frequency 52, 74 

Input-Output identity 24ff, 83 

intervocalic 55, 91f, 134, 145, 159, 161, 
170 

intrusive r in English 32 n.4 


L. 

labials 50ff, 61ff, 66, 75 n.3, 82, 88, 95 
n.3, 102, 104, 105, 109, 110, 125, 
127, 144, 158, 160ff 


laryngeal features, representation of 33 
n.12, 41ff 

laryngeal harmony 41, 43, 55f, 65, 70ff 

laryngeal node 10, 11, 32 n.10, 89, 90, 96 
n.14 

length: see geminates; vowel length 

lengthening: see open syllable lengthening 

lenition 52, 54, 63 

lexical activation 105ff, 113, 118, 121 

lexical vs. postlexical phonology 32 n.12 

lexicon, mental 154, 169, 170 

licensing of [voice] 10ff 

loanwords 7, 27, 30, 32 n.2, 32 n.3, 34 
n.29, 100 

locality 73, 74, 56, 136 


M. 

markedness 12, 16, 31, 46, 65, 92, 142 
markedness conventions 10 
positional markedness 1, 28ff 

monovalency: see binary vs. unary features 

morphology 11, 14, 18, 23, 24, 33 n.13, 
81, 82, 83, 87, 94, 153ff, 168, 170 
Item-and-Arrangement model 83 
root vs. affix distinction 16, 19 
verbal stems 9, 23, 24, 26, 32 n.8, 33 
n.13, 81, 86, 163 
Word-and-Paradigm model 83 
zero morpheme 84 

Multiple Feature Hypothesis 43ff 


N. 

n-deletion 24 

neutralization 13, 25, 26, 33 n.15, 46, 52, 
58ff, 63, 65, 68, 69, 74, 75 n.9, 96 
n.16, 125, 127, 141, 143, 147, 153 
incomplete 127, 141, 147, 153ff, 157, 
159, 163, 166ff 
initial neutralization 58ff 

non-linear phonology 2, 9ff 


O. 

obstruents 1, 3, 4, 6ff, 13, 29, 42, 58, 63ff, 
81, 83, 87, 89, 91, 94, 125ff, 136ff, 
153ff 

onset (of syllable) 10, 13, 14, 17, 26ff, 33 
n.20, 33 n.22, 41, 43, 45, 51ff, 58ff, 
65ff, 71, 87, 91, 93, 94 

open syllable lengthening 24 

Optimality Theory 1ff, 12ff, 22ff, 34 n.29, 
83ff, 91ff 
constraint component Con 2 
factorial typology 13, 18, 29, 30 
Generator component Gen 15 
local conjunction 1, 2, 17ff, 25ff 
richness of the base 26 
self-conjunction 1, 26, 27, 29, 31, 34 
n.26 

Optimality Theory: see also constraints 

orthography: see spelling 

Output-Output identity: see constraints: 
Ident-OO 


P. 

paradigms 9 
paradigm uniformity, levelling 23ff, 
82ff, 153ff 

past participle 8, 25, 155 

past tense 1, 8ff, 18ff, 32 n.8, 33 n.12, 33 
n.23, 85, 155 

perception 41, 43, 45, 99ff, 134, 146, 
153ff, 163, 167ff 

phoneme identification 104, 110ff 

phonetic correlates of [voice] 44, 127, 129 

phonetics-phonology interface 140 

places of articulation, voicing contrasts for 
different — 48, 50, 51, 53, 56, 60, 66, 
73, 75 n.3, 89, 95 n.3, 102, 104, 110, 
126, 160, 161 

plosives 3, 5, 8, 9, 16, 20, 26, 27, 30, 87, 
89, 90, 91, 99ff, 125ff, 153ff 

plural morpheme in English 14f 

posterior cricoarytenoid 143, 147 n.5 

postlexical phonology 32 n.12 

pre-aspiration 90 

prevoicing 41ff, 99ff, 126, 130 
absence 99, 103ff, 107, 117ff 
variation 104, 113, 121f 

Principles and Parameters framework 2, 9, 
12, 32 n.11, 33 n.12 

privative features: see binary vs. unary 
features 

prosodic phonology 18 

Prosodic Word 17, 18, 20 


R. 

rating experiment 153, 157, 158, 161, 
162ff 

regressive voice assimilation: see voicing 
assimilation 

release 43, 60, 99, 129ff, 142, 155, 159ff, 
167 


representations 41ff, 82, 84, 86, 88, 90, 93, 
94, 96 n.15, 105, 108, 128, 154, 158, 
168ff 
early lexical 65, 69ff 
prelexical 105, 118 

rules 1ff, 9, 10ff, 23, 24, 32 n.4, 32 n.5, 32 
n.9, 89, 95 n.11, 125f, 141, 144, 146, 
154, 169 
rule ordering 1, 3, 4, 7, 9, 126 


S. 

schwa 18, 20, 81, 85, 86, 95 n.6, 159 
schwa deletion 24, 85f, 95 n.6 
schwa insertion 30 

segmental duration 135f, 138 

short-long opposition: see geminates; 
vowel length 

Single Feature Hypothesis 44ff 

single-valued features: see binary vs. unary 
features 

Sonorant Voice 95 n.11 

sonorants 3, 8, 20, 64, 65, 71, 83, 87, 94, 
95 n.11, 127, 132, 137, 142 

sonority 24, 28 

speech errors 70, 72, 73, 128 

spelling 4, 31 n.2, 34 n.29, 60, 127f, 147 
n.2, 154ff, 168f 

spontaneous voicing 144f 

[spread glottis] 41ff, 82, 90ff, 96 n.14 

Spread parameter 10 

subphonemic properties (of segments) 153, 
168 

suffixes 3, 7ff, 18, 20ff, 23, 32 n.6, 33 
n.21, 33 n.24, 81, 84, 93, 157 

syllable: see open syllable lengthening 

syllable structure 1, 4, 6, 7, 9, 10, 12, 14, 
15, 17, 18, 19, 24, 28, 34 n.28, 45, 46, 
55, 58, 70ff, 81ff, 105, 109, 113, 127, 
128 
resyllabification 9, 18, 30, 34 n.28, 
95 n.7 
syllabification 5, 18, 32 n.5, 32 n.9, 
33 n.24, 96 
trimoraic syllables 92, 94 


T. 

t-deletion 89 

tense-lax opposition 31 n.2, 81, 88 

theme vowel (in weak verbs) 9, 23, 32 n.9 

transcription 52, 53, 58, 60, 65, 75 n.6, 82, 
87, 147 n.1, 153, 157f, 167f, 170 

truncation 24 
typology 2, 9, 12, 13, 13, 18, 22, 29, 30, 31 
n.1, 33 n.18, 41, 63, 91 


U. 
underspecification 11, 56, 73, 141ff 
unfaithfulness 62, 64, 73, 76 n.12, 76 n.17 


V. 
Vaux’s Law: see constraints 
velars 31, 48, 53, 61, 62, 66, 82, 95 n.3, 
100 
vocal folds 45, 99ff, 142ff, 154, 159ff 
voice onset time (VOT) 43ff, 54f, 59, 65f, 
72, 75 n.3, 99ff, 130f, 147 n.6 
voice tail: see voice termination time 
voice termination time (VTT) 134 
voicing assimilation 1ff, 89, 94, 125ff, 153 
linear analyses 1ff, 125f 
non-linear analyses 1, 2, 9ff 
OT analyses 12ff, 22ff, 34 n.29, 
83ff, 91ff 


progressive voicing assimilation 1, 
8ff, 89 
regressive voicing assimilation 1, 2, 
5, 10ff, 125ff, 153 
voicing errors: see errors in production 
voicing harmony 34 n.27, 57f, 67ff 
vowel duration: see vowel length 
vowel length 24, 88, 137, 146, 156, 159, 
160ff, 167 


W. 

word competitor 99, 100, 104, 106, 117ff, 
170 

word frequency 30, 42, 52, 73ff, 75 n.5, 
107ff, 169 

word recognition 99ff 